ation of application Ser. No. 08/649,419, filed May 16, 1996,
now U.S. Pat. No. 5,862,260. These prior applications are 
incorporated herein by reference. 

Application Ser. No. 08/649,419 is a continuation in part
of PCT/US96/06618, filed May 7, 1996, U.S. application
Ser. No. 08/637,531, filed Apr. 25, 1996 (now U.S. Pat. No.
5,822,436), U.S. application Ser. No. 08/534,005, filed Sep.
25, 1995 (now U.S. Pat. No. 5,832,119), and U.S. appli-
cation Ser. No. 08/436,102, filed May 8, 1995 (now U.S.
Pat. No. 5,748,783).

This application is also a continuation in part of applica- 
tion Ser. No. 09/482,749, filed Jan. 13, 2000. 

TECHNICAL FIELD 
The invention relates to digital watermarking of media 
content, such as images, audio and video. 

BACKGROUND AND SUMMARY 
Digital watermarking is a process for modifying media 
content to embed a machine-readable code into the data 
content. The data may be modified such that the embedded 
code is imperceptible or nearly imperceptible to the user, yet 
may be detected through an automated detection process. 
Most commonly, digital watermarking is applied to media 
such as images, audio signals, and video signals. However, 
it may also be applied to other types of data, including 
documents (e.g., through line, word or character shifting), 
software, multi-dimensional graphics models, and surface 
textures of objects. 

Digital watermarking systems have two primary compo- 
nents: an embedding component that embeds the watermark 
in the media content, and a reading component that detects 
and reads the embedded watermark. The embedding com- 
ponent embeds a watermark pattern by altering data samples 
of the media content. The reading component analyzes 
content to detect whether a watermark pattern is present. In 
applications where the watermark encodes information, the 
reader extracts this information from the detected water-
mark.

One challenge to the developers of watermark embedding
and reading systems is to ensure that the watermark is
detectable even if the watermarked media content is trans-
formed in some fashion. The watermark may be corrupted
intentionally, so as to bypass its copy protection or anti-
counterfeiting functions, or unintentionally through various
transformations that result from routine manipulation of the
content. In the case of watermarked images, such manipu-
lation of the image may distort the watermark pattern
embedded in the image.
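
By way of illustration only, the following Python sketch shows one way a watermark pattern might remain detectable after routine manipulation. It is a minimal sketch under stated assumptions, not the embedder or detector of the invention: the pattern is a keyed pseudo-random sequence of +1 and -1 values, the embedding is a simple addition, the manipulation is modeled as attenuation plus random noise, and the names make_carrier, embed, correlation and detect, as well as the gain and threshold values, are hypothetical.

```python
import random

def make_carrier(n, key):
    """Keyed pseudo-random +/-1 pattern (hypothetical helper)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(host, carrier, gain):
    """Add the scaled pattern to the host samples."""
    return [h + gain * c for h, c in zip(host, carrier)]

def correlation(signal, carrier):
    """Mean-removed correlation between a signal and the pattern."""
    mean = sum(signal) / len(signal)
    return sum((s - mean) * c for s, c in zip(signal, carrier)) / len(signal)

def detect(signal, carrier, threshold=2.0):
    """Declare the pattern present when correlation exceeds an assumed threshold."""
    return correlation(signal, carrier) > threshold

rng = random.Random(0)
host = [rng.gauss(128.0, 20.0) for _ in range(4096)]   # stand-in for image samples
carrier = make_carrier(len(host), key=42)
marked = embed(host, carrier, gain=5.0)

# Model of routine manipulation: attenuate the signal and add noise.
distorted = [0.8 * m + rng.gauss(0.0, 5.0) for m in marked]

print(detect(host, carrier), detect(marked, carrier), detect(distorted, carrier))
```

In this sketch the attenuation lowers the correlation and the noise perturbs it, but because the pattern is spread over many samples the detector can still separate marked from unmarked content.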

The invention provides watermark structures, and related 
embedders, detectors, and readers for processing the water- 
mark structures. In addition, it provides a variety of methods 
and applications associated with the watermark structures, 
embedders, detectors and readers. While adapted for images, 
the watermark system applies to other electronic and physi- 
cal media. For example, it can be applied to electronic 
objects, including image, audio and video signals. It can be 
applied to mark blank paper, film and other substrates, and 
it can be applied by texturing object surfaces for a variety of 
applications, such as identification, authentication, etc. The
detector and reader can operate on a signal captured from a
physical object, even if that captured signal is distorted. 

The watermark structure can have multiple components,
each having different attributes. To name a few, these
attributes include function, signal intensity, transform
domain of watermark definition (e.g., temporal, spatial,
frequency, etc.), location or orientation in host signal,
redundancy, level of security (e.g., encrypted or scrambled).
When describing a watermark signal in the context of this
document, intensity refers to an embedding level while
strength describes reading level (though the terms are some-
times used interchangeably). The components of the water-
mark structure may perform the same or different functions.
For example, one component may carry a message, while
another component may serve to identify the location or
orientation of the watermark in a combined signal.
Moreover, different messages may be encoded in different
temporal or spatial portions of the host signal, such as
different locations in an image or different time frames of
audio or video.

Watermark components may have different signal inten- 
sities. For example, one component may carry a longer 
message, yet have smaller signal intensity than another 
component, or vice-versa. The embedder may adjust the 
signal intensity by encoding one component more redun-
dantly than others, or by applying a different gain to the
components. Additionally, watermark components may be 
defined in different transform domains. One may be defined 
in a frequency domain, while another may be defined in a 
spatial or temporal domain. 
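
The two ways of adjusting signal intensity noted above, redundancy and gain, can be sketched as follows. This is a minimal sketch under stated assumptions: each component carries a single symbol repeated over a set of sample positions, the names embed_component and embedded_strength are hypothetical, and the strength measure compares the marked signal with the original host purely as an analysis aid (it is not a blind detector).

```python
import random

def embed_component(signal, symbols, positions, gain):
    """Add gain * symbol at each assigned position (hypothetical helper)."""
    out = list(signal)
    for pos, sym in zip(positions, symbols):
        out[pos] += gain * sym
    return out

def embedded_strength(marked, host, symbols, positions):
    """Total watermark contribution projected onto the known symbols."""
    return sum((marked[p] - host[p]) * s for p, s in zip(positions, symbols))

rng = random.Random(1)
host = [rng.randrange(256) for _ in range(256)]   # integer samples keep the math exact

# Component A: low gain, high redundancy (one symbol repeated over 64 samples).
pos_a, sym_a = list(range(0, 64)), [1] * 64
# Component B: high gain, low redundancy (one symbol repeated over 16 samples).
pos_b, sym_b = list(range(64, 80)), [1] * 16

marked = embed_component(host, sym_a, pos_a, gain=1)
marked = embed_component(marked, sym_b, pos_b, gain=4)

strength_a = embedded_strength(marked, host, sym_a, pos_a)
strength_b = embedded_strength(marked, host, sym_b, pos_b)
print(strength_a, strength_b)  # 64 64
```

Both components accumulate the same total contribution, yet component A changes each sample by one level while component B changes each sample by four, which illustrates the trade-off between redundancy and per-sample gain.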

The watermark components may be located in different 
spatial or temporal locations in the host signal. In images, for 
example, different components may be located in different 
parts of the image. Each component may carry a different 
message or perform a different function. In audio or video, 
different components may be located in different time
frames of the signal.

The watermark components may be defined, embedded
and extracted in different domains. Examples of domains
include spatial, temporal and frequency domains. A water-
mark may be defined in a domain by specifying how it alters
the host signal in that domain to effect the encoding of the
watermark component. A frequency domain component
alters the signal in the frequency domain, while a spatial
domain component alters the signal in the spatial domain. Of
course, such alterations may have an impact that extends
across many transform domains.

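
The observation that an alteration defined in one domain has an impact in others can be checked with a small worked example. The following is a sketch only, assuming a one-dimensional signal of nine samples and a plain discrete Fourier transform written out with the Python standard library; the helper names dft and idft and the sample values are hypothetical.

```python
import cmath

def dft(x):
    """Direct discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft()."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

host = [10.0, 12.0, 9.0, 14.0, 11.0, 13.0, 8.0, 15.0, 10.0]
n, k, delta = len(host), 2, 9.0

coeffs = dft(host)
coeffs[k] += delta        # alter a single frequency coefficient...
coeffs[n - k] += delta    # ...and its conjugate partner so the result stays real
spatial = idft(coeffs)

marked = [v.real for v in spatial]
changes = [m - h for m, h in zip(marked, host)]
print([round(c, 3) for c in changes])
```

Altering one coefficient pair adds a cosine of amplitude 2*delta/n to the signal, so every one of the nine samples changes even though the component was defined at a single frequency.
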
While described here as watermark components, one can
also construe the components to be different watermarks.
This enables the watermark technology described through-
out this document to be used in applications using two or
more watermarks. For example, some copy protection appli-
cations of the watermark structure may use two or more
watermarks, each performing similar or different functions.
One mark may be more fragile than another, and thus,
disappear when the combined signal is corrupted or trans-
formed in some fashion. The presence or lack of a water-
mark or watermark component conveys information to the
detector to initiate or prohibit some action, such as playback,
copying or recording of the marked signal.
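
The use of a robust mark together with a more fragile mark can be sketched as follows. This is an illustration under stated assumptions rather than a described embodiment: the robust mark is a keyed +1/-1 pattern detected by correlation, the fragile mark is a bit pattern written into the least significant bits of the first samples, the corruption is modeled as coarse requantization, and the function names and the policy returned by decide are hypothetical.

```python
import random

def correlation(signal, carrier):
    """Mean-removed correlation between a signal and the robust pattern."""
    mean = sum(signal) / len(signal)
    return sum((s - mean) * c for s, c in zip(signal, carrier)) / len(signal)

def robust_present(signal, carrier, threshold=3.0):
    return correlation(signal, carrier) > threshold

def fragile_present(signal, pattern):
    # The fragile mark lives in the least significant bits of the first samples.
    return all((s & 1) == b for s, b in zip(signal, pattern))

def decide(signal, carrier, pattern):
    """Hypothetical policy, for illustration only."""
    if not robust_present(signal, carrier):
        return "unmarked: no restriction"
    if fragile_present(signal, pattern):
        return "original: allow playback and copying"
    return "processed copy: allow playback, prohibit copying"

rng = random.Random(3)
n = 4096
host = [int(round(rng.gauss(128, 20))) for _ in range(n)]
carrier = [rng.choice((-1, 1)) for _ in range(n)]
pattern = [1, 0] * 16

marked = [h + 6 * c for h, c in zip(host, carrier)]             # robust mark
marked[:32] = [(s & ~1) | b for s, b in zip(marked, pattern)]    # fragile mark
copy = [(s // 8) * 8 for s in marked]                            # coarse requantization

for name, sig in (("host", host), ("marked", marked), ("copy", copy)):
    print(name, "->", decide(sig, carrier, pattern))
```

Requantization clears the least significant bits, so the fragile mark disappears from the copy while the robust mark remains detectable; the presence of one mark without the other is what conveys information to the detector.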

A watermark system may include an embedder, detector,
and reader. The watermark embedder encodes a watermark
signal in a host signal to create a combined signal. The
detector looks for the watermark signal in a potentially
corrupted version of the combined signal, and computes its
orientation. Finally, a reader extracts a message in the
watermark signal from the combined signal using the ori-
entation to approximate the original state of the combined
signal.

There are a variety of alternative embodiments of the
embedder and detector. One embodiment of the embedder
performs error correction coding of a binary message, and
then combines the binary message with a carrier signal to
create a component of a watermark signal. It then combines
the watermark signal with a host signal. To facilitate
detection, it may also add a detection component to form a
composite watermark signal having a message and detection
component. The message component includes known or
signature bits to facilitate detection, and thus, serves a dual
function of identifying the mark and conveying a message.
The detection component is designed to identify the orien-
tation of the watermark in the combined signal, but may
carry an information signal as well. For example, the signal
values at selected locations in the detection component can
be altered to encode a message.

One embodiment of the detector estimates an initial
orientation of a watermark signal in the multidimensional
signal, and refines the initial orientation to compute a refined
orientation. As part of the process of refining the orientation,
this detector computes at least one orientation parameter that
increases correlation between the watermark signal and the
multidimensional signal when the watermark or multidi-
mensional signal is adjusted with the refined orientation.

Another detector embodiment computes orientation
parameter candidates of a watermark signal in different
portions of the target signal, and compares the similarity of
orientation parameter candidates from the different portions.
Based on this comparison, it determines which candidates
are more likely to correspond to a valid watermark signal.

Yet another detector embodiment estimates orientation of
the watermark in a target signal suspected of having a
watermark. The detector then uses the orientation to extract
a measure of the watermark in the target. It uses the measure
of the watermark to assess merits of the estimated orienta-
tion. In one implementation, the measure of the watermark
is the extent to which message bits read from the target
signal match with expected bits. Another measure is the
extent to which values of the target signal are consistent with
the watermark signal. The measure of the watermark signal
provides information about the merits of a given orientation
that can be used to find a better estimate of the orientation.

Further advantages and features of the invention will
become apparent with reference to the following detailed
description and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating an image watermark
system.

FIG. 2 is a block diagram illustrating an image watermark
embedder.

FIG. 3 is a spatial frequency domain plot of a detection
watermark signal.

FIG. 4 is a flow diagram of a process for detecting a
watermark signal in an image and computing its orientation
within the image.

FIG. 5 is a flow diagram of a process for reading a message
encoded in a watermark.

FIG. 6 is a diagram depicting an example of a watermark
detection process.

FIG. 7 is a diagram depicting the orientation of a trans-
formed image superimposed over the original orientation of
the image at the time of watermark encoding.

FIG. 8 is a diagram illustrating an implementation of a
watermark embedder.

FIG. 9 is a diagram depicting an assignment map used to
map raw bits in a message to locations within a host image.

FIG. 10 illustrates an example of a watermark orientation
signal in a spatial frequency domain.

FIG. 11 illustrates the orientation signal shown in FIG. 10
in the spatial domain.

FIG. 12 is a diagram illustrating an overview of a water-
mark detector implementation.

FIG. 13 is a diagram illustrating an implementation of the
detector pre-processor depicted generally in FIG. 12.

FIG. 14 is a diagram illustrating a process for estimating
rotation and scale vectors of a detection watermark signal.

FIG. 15 is a diagram illustrating a process for refining the
rotation and scale vectors, and for estimating differential
parameters of the detection watermark signal.

FIG. 16 is a diagram illustrating a process for aggregating
evidence of the orientation signal and orientation parameter
candidates from two or more frames.

FIG. 17 [text illegible in source]
parameters of the detection watermark signal.

FIG. 18 is a diagram illustrating a process for refining
orientation parameters using known message bits in the
watermark message.

FIG. 19 is a diagram illustrating a process for reading a
watermark message from an image, after re-orienting the
image data using an orientation vector.

FIG. 20 is a diagram of a computer system that serves as
an operating environment for software implementations of a
watermark embedder, detector and reader.

FIG. 21 is a diagram illustrating that a watermark embed-
ded in a signal may be used to carry auxiliary data in the
signal and a pointer to auxiliary data in a database.

FIG. 22 is a diagram illustrating that different regions of
an image may carry different watermark information.

FIG. 23 is a diagram illustrating how digital watermarks
embedded in print media may be used to link the print media
to electronic commerce.

FIG. 24A is a diagram illustrating a method for embed-
ding digital watermarks in print media.

FIG. 24B is a diagram illustrating a method for reading an
embedded digital watermark in print media.

DETAILED DESCRIPTION

1.0 Introduction
A watermark can be viewed as an information signal that
is embedded in a host signal, such as an image, audio, or
some other media content. Watermarking systems based on
the following detailed description may include the following
components: 1) An embedder that inserts a watermark signal
in the host signal to form a combined signal; 2) A detector
that determines the presence and orientation of a watermark
in a potentially corrupted version of the combined signal;
and 3) A reader that extracts a watermark message from the
combined signal. In some implementations, the detector and
reader are combined.

The structure and complexity of the watermark signal can
vary significantly, depending on the application. For
example, the watermark may be comprised of one or more
signal components, each defined in the same or different
domains. Each component may perform one or more func-
tions. Two primary functions include acting as an identifier
to facilitate detection and acting as an information carrier to
convey a message. In addition, components may be located
in different spatial or temporal portions of the host signal,
and may carry the same or different messages.

The host signal can vary as well. The host is typically
some form of multi-dimensional media signal, such as an
image, audio sequence or video sequence. In the digital
domain, each of these media types is represented as a
multi-dimensional array of discrete samples. For example, a
color image has spatial dimensions (e.g., its horizontal and
vertical components), and color space dimensions (e.g.,
YUV or RGB). Some signals, like video, have spatial and
temporal dimensions. Depending on the needs of a particular
application, the embedder may insert a watermark signal
that exists in one or more of these dimensions.

In the design of the watermark and its components,
developers are faced with several design issues such as: the
extent to which the mark is impervious to jamming and
manipulation (either intentional or unintentional); the extent
of imperceptibility; the quantity of information content; the
extent to which the mark facilitates detection and recovery;
and the extent to which the information content can be
recovered accurately.

For certain applications, such as copy protection or
authentication, the watermark should be difficult to tamper
with or remove by those seeking to circumvent it. To be
robust, the watermark must withstand routine manipulation,
such as data compression, copying, linear transformation,
flipping, inversion, etc., and intentional manipulation
intended to remove the mark or make it undetectable. Some
applications require the watermark signal to remain robust
through digital to analog conversion (e.g., printing an image
or playing music), and analog to digital conversion (e.g.,
scanning the image or digitally sampling the music). In some
cases, it is beneficial for the watermarking technique to
withstand repeated watermarking.

A variety of signal processing techniques may be applied
to address some or all of these design considerations. One
such technique is referred to as spreading. Sometimes cat-
egorized as a spread spectrum technique, spreading is a way
to distribute a message into a number of components (chips),
which together make up the entire message. Spreading
makes the mark more impervious to jamming and
manipulation, and makes it less perceptible.

Another category of signal processing technique is error
correction and detection coding. Error correction coding is
useful to reconstruct the message accurately from the water-
mark signal. Error detection coding enables the decoder to
determine when the extracted message has an error.

Another signal processing technique that is useful in
watermark coding is called scattering. Scattering is a method
of distributing the message or its components among an
array of locations in a particular transform domain, such as
a spatial domain or a spatial frequency domain. Like
spreading, scattering makes the watermark less perceptible
and more impervious to manipulation.

Yet another signal processing technique is gain control.
Gain control is used to adjust the intensity of the watermark
signal. The intensity of the signal impacts a number of
aspects of watermark coding, including its perceptibility to
the ordinary observer, and the ability to detect the mark and
accurately recover the message from it.

Gain control can impact the various functions and com-
ponents of the watermark differently. Thus, in some cases, it
is useful to control the gain while taking into account its
impact on the message and orientation functions of the
watermark or its components. For example, in a watermark
system described below, the embedder calculates a different
gain for orientation and message components of an image
watermark.

Another useful tool in watermark embedding and reading
is perceptual analysis. Perceptual analysis refers generally to
techniques for evaluating signal properties based on the
extent to which those properties are (or are likely to be)
perceptible to humans (e.g., listeners or viewers of the media
content). A watermark embedder can take advantage of a
Human Visual System (HVS) model to determine where to
place a watermark and how to control the intensity of the
watermark so that chances of accurately recovering the
watermark are enhanced, resistance to tampering is
increased, and perceptibility of the watermark is reduced.
Such perceptual analysis can play an integral role in gain
control because it helps indicate how the gain can be
adjusted relative to the impact on the perceptibility of the
mark. Perceptual analysis can also play an integral role in
locating the watermark in a host signal. For example, one
might design the embedder to hide a watermark in portions
of a host signal that are more likely to mask the mark from
human perception.

Various forms of statistical analyses may be performed on
a signal to identify places to locate the watermark, and to
identify places where to extract the watermark. For example,
a statistical analysis can identify portions of a host image
that have noise-like properties that are likely to make
recovery of the watermark signal difficult. Similarly, statis-
tical analyses may be used to characterize the host signal to
determine where to locate the watermark.

Each of the techniques may be used alone, in various
combinations, and in combination with other signal process-
ing techniques.

In addition to selecting the appropriate signal processing
techniques, the developer is faced with other design con-
siderations. One consideration is the nature and format of the
media content. In the case of digital images, for example, the
image data is typically represented as an array of image
samples. Color images are represented as an array of color
vectors in a color space, such as RGB or YUV. The water-
mark may be embedded in one or more of the color
components of an image. In some implementations, the
embedder may transform the input image into a target color
space, and then proceed with the embedding process in that
color space.

2.0 Digital Watermark Embedder and Reader
Overview
The following sections describe implementations of a
watermark embedder and reader that operate on digital
signals. The embedder encodes a message into a digital
signal by modifying its sample values such that the message
is imperceptible to the ordinary observer in output form. To
extract the message, the reader captures a representation of
the signal suspected of containing a watermark and then
processes it to detect the watermark and decode the message.

FIG. 1 is a block diagram summarizing signal processing
operations involved in embedding and reading a watermark.
There are three primary inputs to the embedding process: the
original, digitized signal 100, the message 102, and a series
of control parameters 104. The control parameters may
include one or more keys. One key or set of keys may be
used to encrypt the message. Another key or set of keys may
be used to control the generation of a watermark carrier
signal or a mapping of information bits in the message to
positions in a watermark information signal.

The carrier signal or mapping of the message to the host
signal may be encrypted as well. Such encryption may
increase security by varying the carrier or mapping for
different components of the watermark or watermark mes-
sage. Similarly, if the watermark or watermark message is
redundantly encoded throughout the host signal, one or more
encryption keys can be used to scramble the carrier or signal
mapping for each instance of the redundantly encoded
watermark. This use of encryption provides one way to vary
the encoding of each instance of the redundantly encoded
message in the host signal. Other parameters may include
control bits added to the message, and watermark signal
attributes (e.g., orientation or other detection patterns) used
to assist in the detection of the watermark.

Apart from encrypting or scrambling the carrier and
mapping information, the embedder may apply different,
and possibly unique carrier or mapping for different com-
ponents of a message, for different messages, or for
different watermarks or watermark components to be
embedded in the host signal. For example, one watermark
may be encoded in a block of samples with one carrier, while
another, possibly different watermark, is encoded in a dif-
ferent block with a different carrier. A similar approach is to
use different mappings in different blocks of the host signal.

The watermark embedding process 106 converts the mes-
sage to a watermark information signal. It then combines
this signal with the input signal and possibly another signal
(e.g., an orientation pattern) to create a watermarked signal
108. The process of combining the watermark with the input
signal may be a linear or non-linear function. Examples of
watermarking functions include: S* = S + gX; S* = S(1 + gX);
and S* = S e^(gX); where S* is the watermarked signal vector, S
is the input signal vector, and g is a function controlling
watermark intensity. The watermark may be applied by
modulating signal samples S in the spatial, temporal or some
other transform domain.

To encode a message, the watermark encoder analyzes
and selectively adjusts the host signal to give it attributes
that correspond to the desired message symbol or symbols to
be encoded. There are many signal attributes that may
encode a message symbol, such as a positive or negative
polarity of signal samples or a set of samples, a given parity
(odd or even), a given difference value or polarity of the
difference between signal samples (e.g., a difference
between selected spatial intensity values or transform
coefficients), a given distance value between watermarks, a
given phase or phase offset between different watermark
components, a modulation of the phase of the host signal, a
modulation of frequency coefficients of the host signal, a
given frequency pattern, a given quantizer (e.g., in Quanti-
zation Index Modulation) etc.

Some processes for combining the watermark with the
input signal are termed non-linear, such as processes that
employ dither modulation, modify least significant bits, or
apply quantization index modulation. One type of non-linear
modulation is where the embedder sets signal values so that
they have some desired value or characteristic correspond-
ing to a message symbol. For example, the embedder may
designate that a portion of the host signal is to encode a
given bit value. It then evaluates a signal value or set of
values in that portion to determine whether they have the
attribute corresponding to the message bit to be encoded.
Some examples of attributes include a positive or negative
polarity, a value that is odd or even, a checksum, etc. For
example, a bit value may be encoded as a one or zero by
quantizing the value of a selected sample to be even or odd.
As another example, the embedder might compute a check-
sum or parity of an N bit pixel value or transform coefficient
and then set the least significant bit to the value of the
checksum or parity. Of course, if the signal already corre-
sponds to the desired message bit value, it need not be
altered. The same approach can be extended to a set of signal
samples where some attribute of the set is adjusted as
necessary to encode a desired message symbol. These tech-
niques can be applied to signal samples in a transform
domain (e.g., transform coefficients) or samples in the
temporal or spatial domains.

Quantization index modulation techniques employ a set of
quantizers. In these techniques, the message to be transmit-
ted is used as an index for quantizer selection. In the
decoding process, a distance metric is evaluated for all
quantizers and the index with the smallest distance identifies
the message value.

The watermark detector 110 operates on a digitized signal
suspected of containing a watermark. As depicted generally
in FIG. 1, the suspect signal may undergo various transfor-
mations 112, such as conversion to and from an analog
domain, cropping, copying, editing, compression/
decompression, transmission etc. Using parameters 114
from the embedder (e.g., orientation pattern, control bits,
key(s)), it performs a series of correlation or other operations
on the captured image to detect the presence of a watermark.
If it finds a watermark, it determines its orientation within
the suspect signal.

Using the orientation, if necessary, the reader 116 extracts
the message. Some implementations do not perform
correlation, but instead, use some other detection process or
proceed directly to extract the watermark signal. For
instance in some applications, a reader may be invoked one
or more times at various temporal or spatial locations in an
attempt to read the watermark, without a separate pre-
processing stage to detect the watermark's orientation.

Some implementations require the original,
un-watermarked signal to decode a watermark message,
while others do not. In those approaches where the original
signal is not necessary, the original un-watermarked signal
can still be used to improve the accuracy of message
recovery. For example, the original signal can be removed,
leaving a residual signal from which the watermark message
is recovered. If the decoder does not have the original signal,
it can still attempt to remove portions of it (e.g., by filtering)
that are expected not to contain the watermark signal.

Watermark decoder implementations use known relation-
ships between a watermark signal and a message symbol to
extract estimates of message symbol values from a signal
suspected of containing a watermark. The decoder has
knowledge of the properties of message symbols and how
and where they are encoded into the host signal to encode a
message. For example, it knows how message bit values of
one and a zero are encoded and it knows where these
message bits are originally encoded. Based on this
information, it can look for the message properties in the
watermarked signal. For example, it can test the water-
marked signal to see if it has attributes of each message
symbol (e.g., a one or zero) at a particular location and
generate a probability measure as an indicator of the like-
lihood that a message symbol has been encoded. Knowing
the approximate location of the watermark in the water-
marked signal, the reader implementation may compare
known message properties with the properties of the water-
marked signal to estimate message values, even if the
original signal is unavailable. Distortions to the water-
marked signal and the host signal itself make the watermark
difficult to recover, but accurate recovery of the message can
be enhanced using a variety of techniques, such as error
correction coding, watermark signal prediction, redundant
message encoding, etc.

One way to recover a message value from a watermarked
signal is to perform correlation between the known message
property of each message symbol and the watermarked
signal. If the amount of correlation exceeds a threshold, for
example, then the watermarked signal may be assumed to
contain the message symbol. The same process can be
repeated for different symbols at various locations to extract
a message. A symbol (e.g., a binary value of one or zero) or
set of symbols may be encoded redundantly to enhance
message recovery.

In some cases, it is useful to filter the watermarked signal
to remove aspects of the signal that are unlikely to be helpful
in recovering the message and/or are likely to interfere with
the watermark message. For example, the decoder can filter
out portions of the original signal and another watermark
signal or signals. In addition, when the original signal is
unavailable, the reader can estimate or predict the original
signal based on properties of the watermarked signal. The
original or predicted version of the original signal can then
be used to recover an estimate of the watermark message.
One way to use the predicted version to recover the water-
mark is to remove the predicted version before reading the
desired watermark. Similarly, the decoder can predict and
remove un-wanted watermarks or watermark components
before reading the desired watermark in a signal having two
or more watermarks.

2.1 Image Watermark Embedder
FIG. 2 is a block diagram illustrating an implementation
of an exemplary embedder in more detail. The embedding
process begins with the message 200. As noted above, the
message is a binary number suitable for conversion to a
watermark signal. For additional security, the message, its
carrier, and the mapping of the watermark to the host signal
may be encrypted with an encryption key 202. In addition to
the information conveyed in the message, the embedder may
also add control bit values ("signature bits") to the message
to assist in verifying the accuracy of a read operation. These
control bits, along with the bits representing the message,
are input to an error correction coding process 204 designed
to increase the likelihood that the message can be recovered
accurately in the reader.

There are several alternative error correction coding

different message in different locations of the signal. The
carrier signal may be a noise image. For each raw bit, the
assignment map specifies the corresponding image sample
or samples that will be modified to encode that bit.

The embedder depicted in FIG. 2 operates on blocks of
image data (referred to as 'tiles') and replicates a watermark
in each of these blocks. As such, the carrier signal and
assignment map both correspond to an image block of a
pre-determined size, namely, the size of the tile. To encode
each bit, the embedder applies the assignment map to
determine the corresponding image samples in the block to
be modified to encode that bit. Using the map, it finds the
corresponding image samples in the carrier signal. For each
bit, the embedder computes the value of image samples in
the watermark information signal as a function of the raw bit
value and the value(s) of the corresponding samples in the
carrier signal.

To illustrate the embedding process further, it is helpful to
consider an example. First, consider the following back-
ground. Digital watermarking processes are sometimes
described in terms of the transform domain in which the
watermark signal is defined. The watermark may be defined
in the spatial or temporal domain, or some other transform
domain such as a wavelet transform, Discrete Cosine Trans-
form (DCT), Discrete Fourier Transform (DFT), Hadamard
transform, Hartley transform, Karhunen-Loeve transform
(KLT) domain, etc.

Consider an example where the watermark is defined in a
transform domain (e.g., a frequency domain such as DCT,
wavelet or DFT). The embedder segments the image in the
spatial domain into rectangular tiles and transforms the
image samples in each tile into the transform domain. For
example in the DCT domain, the embedder segments the
image into N by N blocks and transforms each block into an
N by N block of DCT coefficients. In this example, the
assignment map specifies the corresponding sample location
or locations in the frequency domain of the tile that corre-
spond to a bit position in the raw bits. In the frequency
domain, the carrier signal looks like a noise pattern. Each
image sample in the frequency domain of the carrier signal
is used together with a selected raw bit value to compute the
value of the image sample at the location in the watermark
information signal.

Now consider an example where the watermark is defined
in the spatial domain. The embedder segments the image in
the spatial domain into rectangular tiles of image samples
(i.e. pixels). In this example, the assignment map specifies
the corresponding sample location or locations in the tile
that correspond to each bit position in the raw bits. In the
schemes that may be employed. Some examples include so spatial domain, the carrier signal looks Hke a noise pattern 
BCH, convolution. Reed Solomon and hirbo codes. These extending throughout the tile. Each image sample in the 
forms of error correction coding are sometimes used in spatial domain of the carrier signal is used together with a 
communication applications where data is encoded in a selected raw bit value to compute the value of the image 
carrier signal that transfers the encoded data from one place sample at the same location in the watermark information 
to another. In the digital watermarking application discussed 55 signal. 

here, the raw bit data is encoded in a fundamental carrier With this background, the embedder proceeds to encode 
^'Snal. each raw bit in the selected transform domain as follows. If 

In addition to the error correction coding schemes men- uses the assignment map to look up the position of the 
tioned above, the embedder and reader may also use a Cyclic corresponding image sample (or samples) in the carrier 
Redundancy Check (CRC) to faciliute detection of errors in 60 signal. The image sample value at that position in the carrier 
the decoded message data. controls the value of the corresponding position in the 



a coding function 204 produces a watermark information signal. In particular, the carrier 

strmg of bits, termed raw bits 206, that are embedded into a sample value indicates whether to invert the corresponding 

watermark information signal. Using a carrier signal 208 watemiark sample value. The raw bit value is either a one or 

and an assignment map 210, the illustrated embedder 65 zero. Disregarding for a moment the impact of the can-ier 

encodes the raw bits in a watermark information signal 212, signal, the embedder adjusts the con-esponding watermark 

214. In some appHcations, the embedder may encode a sample upward to represent a one, or downward to represent 
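The per-bit rule described here (the raw bit sets the direction of adjustment, and the carrier sample decides whether that direction is inverted) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function name `embed_raw_bits`, the use of +1/-1 for "upward"/"downward", the 0/1 encoding of the carrier (1 meaning "invert"), and the representation of the assignment map as a list of raw-bit indices per sample location are all hypothetical choices, not details given in the text.

```python
def embed_raw_bits(raw_bits, carrier, assignment_map):
    """Compute a watermark information signal for one tile.

    raw_bits       -- list of 0/1 raw bit values
    carrier        -- list of 0/1 flags, one per sample; 1 means "invert"
                      (an assumed encoding of the carrier signal)
    assignment_map -- list giving, for each sample location, the index of
                      the raw bit assigned to that location (assumed layout)

    Returns a list of +1 (adjust upward) or -1 (adjust downward) per sample.
    """
    if len(carrier) != len(assignment_map):
        raise ValueError("carrier and assignment map must cover the same tile")

    signal = []
    for location, bit_index in enumerate(assignment_map):
        # Ignoring the carrier: upward for a one, downward for a zero.
        direction = 1 if raw_bits[bit_index] == 1 else -1
        # The carrier sample indicates whether to invert that adjustment.
        if carrier[location] == 1:
            direction = -direction
        signal.append(direction)
    return signal


if __name__ == "__main__":
    # Two raw bits, each mapped to three locations of a six-sample tile.
    print(embed_raw_bits([1, 0], [0, 1, 1, 0, 0, 1], [0, 1, 0, 1, 0, 1]))
```

Because each raw bit maps to several locations, a reader that knows the same carrier and assignment map obtains several independent estimates of every bit.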
Now, if the carrier signal indicates that the corresponding sample should be inverted, the embedder adjusts the watermark sample downward to represent a one, and upward to represent a zero. In this manner, the embedder computes the value of the watermark samples for a raw bit using the assignment map to find the spatial location of those samples within the block.

From this example, a number of points can be made. First, the embedder may perform a similar approach in any other transform domain. Second, for each raw bit, the corresponding watermark sample or samples are some function of the raw bit value and the carrier signal value. The specific mathematical relationship between the watermark sample, on one hand, and the raw bit value and carrier signal, on the other, may vary with the implementation. For example, the message may be convolved with the carrier, multiplied with the carrier, added to the carrier, or applied based on another non-linear function. Third, the carrier signal may remain constant for a particular application, or it may vary from one message to another. For example, a secret key may be used to generate the carrier signal. For each raw bit, the assignment map may define a pattern of watermark samples in the transform domain in which the watermark is defined. An assignment map that maps a raw bit to a sample location or set of locations (i.e. a map to locations in a frequency or spatial domain) is just one special case of an assignment map for a transform domain. Fourth, the assignment map may remain constant, or it may vary from one message to another. In addition, the carrier signal and map may vary depending on the nature of the underlying image. In sum, there are many possible design choices within the implementation framework described above.

The embedder depicted in FIG. 2 combines another watermark component, shown as the detection watermark 216, with the watermark information signal to compute the final watermark signal. The detection watermark is specifically chosen to assist in identifying the watermark and computing its orientation in a detection operation.

FIG. 3 is a spatial frequency plot illustrating one quadrant of a detection watermark. The points in the plot represent impulse functions indicating signal content of the detection watermark signal. The pattern of impulse functions for the illustrated quadrant is replicated in all four quadrants. There are a number of properties of the detection pattern that impact its effectiveness for a particular application. The selection of these properties is highly dependent on the application. One property is the extent to which the pattern is symmetric about one or more axes. For example, if the detection pattern is symmetrical about the horizontal and vertical axes, it is referred to as being quad symmetric. If it is further symmetrical about diagonal axes at an angle of 45 degrees, it is referred to as being octally symmetric (repeated in a symmetric pattern 8 times about the origin). Such symmetry aids in identifying the watermark in an image, and aids in extracting the rotation angle. However, in the case of an octally symmetric pattern, the detector includes an additional step of testing which of the four quadrants the orientation angle falls into.

Another criterion is the position of the impulse functions and the frequency range that they reside in. Preferably, the impulse functions fall in a mid frequency range. If they are located in a low frequency range, they may be noticeable in the watermarked image. If they are located in the high frequency range, they are more difficult to recover. Also, they should be selected so that scaling, rotation, and other manipulations of the watermarked signal do not push the impulse functions outside the range of the detector. Finally, the impulse functions should preferably not fall on the vertical or horizontal axes, and each impulse function should have a unique horizontal and vertical location. While the example depicted in FIG. 3 shows that some of the impulse functions fall on the same horizontal axis, it is trivial to alter the position of the impulse functions such that each has a unique vertical or horizontal coordinate.

Returning to FIG. 2, the embedder makes a perceptual analysis 218 of the input image 220 to identify portions of the image that can withstand more watermark signal content without substantially impacting image fidelity. Generally, the perceptual analysis employs an HVS model to identify frequency bands and/or spatial areas to increase or decrease watermark signal intensity to make the watermark imperceptible to an ordinary observer. One type of model is to increase watermark intensity in frequency bands and spatial areas where there is more image activity. In these areas, the sample values are changing more than other areas and have more signal strength. The output of the perceptual analysis is a perceptual mask 222. The mask may be implemented as an array of functions, which selectively increase the signal strength of the watermark signal based on an HVS model analysis of the input image. The mask may selectively increase or decrease the signal strength of the watermark signal in areas of greater signal activity.

The embedder combines (224) the watermark information, the detection signal and the perceptual mask to yield the watermark signal 226. Finally, it combines (228) the input image 220 and the watermark signal 226 to create the watermarked image 230. In the frequency domain watermark example above, the embedder combines the transform domain coefficients in the watermark signal to the corresponding coefficients in the input image to create a frequency domain representation of the watermarked image. It then transforms the image into the spatial domain. As an alternative, the embedder may be designed to convert the watermark into the spatial domain, and then add it to the image.

In the spatial watermark example above, the embedder combines the image samples in the watermark signal to the corresponding samples in the input image to create the watermarked image 230.

The embedder may employ an invertible or non-invertible, and linear or non-linear function to combine the watermark signal and the input image (e.g., linear functions such as S* = S + gX or S* = S(1 + gX), convolution, quantization index modulation). The net effect is that some image samples in the input image are adjusted upward, while others are adjusted downward. The extent of the adjustment is greater in areas or subbands of the image having greater signal activity.

2.2 Overview of a Detector and Reader

FIG. 4 is a flow diagram illustrating an overview of a watermark detection process. This process analyzes image data 400 to search for an orientation pattern of a watermark in an image suspected of containing the watermark (the target image). First, the detector transforms the image data to another domain 402, namely the spatial frequency domain, and then performs a series of correlation or other detection operations 404. The correlation operations match the orientation pattern with the target image data to detect the presence of the watermark and its orientation parameters 406 (e.g., translation, scale, rotation, and differential scale relative to its original orientation). Finally, it re-orients the image data based on one or more of the orientation parameters 408.

If the orientation of the watermark is recovered, the reader extracts the watermark information signal from the image
data (optionally by first re-orienting the data based on the orientation parameters). FIG. 5 is a flow diagram illustrating a process of extracting a message from re-oriented image data 500. The reader scans the image samples (e.g., pixels or transform domain coefficients) of the re-oriented image (502), and uses known attributes of the watermark signal to estimate watermark signal values 504. Recall that in one example implementation described above, the embedder adjusted sample values (e.g., frequency coefficients, color values, etc.) up or down to embed a watermark information signal. The reader uses this attribute of the watermark information signal to estimate its value from the target image. Prior to making these estimates, the reader may filter the image to remove portions of the image signal that may interfere with the estimating process. For example, if the watermark signal is expected to reside in low or medium frequency bands, then high frequencies may be filtered out.

In addition, the reader may predict the value of the original un-watermarked image to enhance message recovery. One form of prediction uses temporal or spatial neighbors to estimate a sample value in the original image. In the frequency domain, frequency coefficients of the original signal can be predicted from neighboring frequency coefficients in the same frequency subband. In video applications for example, a frequency coefficient in a frame can be predicted from spatially neighboring coefficients within the same frame, or temporally neighboring coefficients in adjacent frames or fields. In the spatial domain, intensity values of a pixel can be estimated from intensity values of neighboring pixels. Having predicted the value of a signal in the original, un-watermarked image, the reader then estimates the watermark signal by calculating an inverse of the watermarking function used to combine the watermark signal with the original signal.

For such watermark signal estimates, the reader uses the assignment map to find the corresponding raw bit position and image sample in the carrier signal (506). The value of the raw bit is a function of the watermark signal estimate, and the carrier signal at the corresponding location in the carrier. To estimate the raw bit value, the reader solves for its value based on the carrier signal and the watermark signal estimate. As reflected generally in FIG. 5 (508), the result of this computation represents only one estimate to be analyzed along with other estimates impacting the value of the corresponding raw bit. Some estimates may indicate that the raw bit is likely to be a one, while others may indicate that it is a zero. After the reader completes its scan, it compiles the estimates for each bit position in the raw bit string, and makes a determination of the value of each bit at that position (510). Finally, it performs the inverse of the error correction coding scheme to construct the message (512). In some implementations, probabilistic models may be employed to determine the likelihood that a particular pattern of raw bits is just a random occurrence rather than a watermark.

2.2.1 Example Illustrating Detector Process

FIG. 6 is a diagram depicting an example of a watermark detection process. The detector segments the target image into blocks (e.g., 600, 602) and then performs a 2-dimensional fast Fourier transform (2D FFT) on several blocks. This process yields 2D transforms of the magnitudes of the image contents of the blocks in the spatial frequency domain as depicted in the plot 604 shown in FIG. 6.

Next, the detector process performs a log polar remapping of each transformed block. The detector may add some of the blocks together to increase the watermark signal to noise ratio. The type of remapping in this implementation is referred to as a Fourier Mellin transform. The Fourier Mellin transform is a geometric transform that warps the image data from a frequency domain to a log polar coordinate system. As depicted in the plot 606 shown in FIG. 6, this transform sweeps through the transformed image data along a line at angle θ, mapping the data to a log polar coordinate system shown in the next plot 608. The log polar coordinate system has a rotation axis, representing the angle θ, and a scale axis. Inspecting the transformed data at this stage, one can see the orientation pattern of the watermark begin to be distinguishable from the noise component (i.e., the image signal).

Next, the detector performs a correlation 610 between the transformed image block and the transformed orientation pattern 612. At a high level, the correlation process slides the orientation pattern over the transformed image (in a selected transform domain, such as a spatial frequency domain) and measures the correlation at an array of discrete positions. Each such position has a corresponding scale and rotation parameter associated with it. Ideally, there is a position that clearly has the highest correlation relative to all of the others. In practice, there may be several candidates with a promising measure of correlation. As explained further below, these candidates may be subjected to one or more additional correlation stages to select the one that provides the best match.

There are a variety of ways to implement the correlation process. Any number of generalized matching filters may be implemented for this purpose. One such filter performs an FFT on the target and the orientation pattern, and multiplies the resulting arrays together to yield a multiplied FFT. Finally, it performs an inverse FFT on the multiplied FFT to return the data into its original log-polar domain. The position or positions within this resulting array with the highest magnitude represent the candidates with the highest correlation.

When there are several viable candidates, the detector can select a set of the top candidates and apply an additional correlation stage. Each candidate has a corresponding rotation and scale parameter. The correlation stage rotates and scales the FFT of the orientation pattern and performs a matching operation with the rotated and scaled pattern on the FFT of the target image. The matching operation multiplies the values of the transformed pattern with sample values at corresponding positions in the target image and accumulates the result to yield a measure of the correlation. The detector repeats this process for each of the candidates and picks the one with the highest measure of correlation. As shown in FIG. 6, the rotation and scale parameters (614) of the selected candidate are then used to find additional parameters that describe the orientation of the watermark in the target image.

The detector applies the scale and rotation to the target data block 616 and then performs another correlation process between the orientation pattern 618 and the scaled and rotated data block 616. The correlation process 620 is a generalized matching filter operation. It provides a measure of correlation for an array of positions that each has an associated translation parameter (e.g., an x, y position). Again, the detector may repeat the process of identifying promising candidates (i.e. those that reflect better correlation relative to others) and using those in an additional search for a parameter or set of orientation parameters that provide a better measure of correlation.

At this point, the detector has recovered the following orientation parameters: rotation, scale and translation. For many applications, these parameters may be sufficient to enable accurate reading of the watermark.
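The transform-multiply-inverse-transform matching filter described above for the correlation stages can be illustrated with a small sketch. It makes several simplifying assumptions that are not in the text: it works on a one-dimensional signal rather than a two-dimensional log-polar array, it uses a naive discrete Fourier transform instead of an FFT, and it multiplies the target transform by the complex conjugate of the pattern transform (the usual way to obtain correlation rather than convolution, whereas the text only says the arrays are multiplied). The function names `dft`, `circular_correlation` and `best_shift` are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); stands in for an FFT."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(total / n if inverse else total)
    return out


def circular_correlation(target, pattern):
    """Correlation of pattern with target at every circular shift.

    Transform both signals, multiply the target transform by the conjugate
    of the pattern transform, and transform back.
    """
    target_f = dft(target)
    pattern_f = dft(pattern)
    product = [t * p.conjugate() for t, p in zip(target_f, pattern_f)]
    return [c.real for c in dft(product, inverse=True)]


def best_shift(target, pattern):
    """Index of the shift with the highest correlation (the top candidate)."""
    scores = circular_correlation(target, pattern)
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    pattern = [1, -1, 1, 1, -1, -1, 1, -1]
    # Target is the pattern circularly shifted by three positions.
    target = [pattern[(t - 3) % len(pattern)] for t in range(len(pattern))]
    print(best_shift(target, pattern))
```

In the log-polar domain a shift along one axis corresponds to a rotation and a shift along the other to a scale change, so the peak position in the two-dimensional version of this array is what yields the rotation and scale candidates.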
In the read operation, the reader applies the orientation parameters to re-orient the target image and then proceeds to extract the watermark signal.

In some applications, the watermarked image may be stretched more in one spatial dimension than another. This type of distortion is sometimes referred to as differential scale or shear. Consider that the original image blocks are square. As a result of differential scale, each square may be warped into a parallelogram with unequal sides. Differential scale parameters define the nature and extent of this stretching.

There are several alternative ways to recover the differential scale parameters. One general class of techniques is to use the known parameters (e.g., the computed scale, rotation, and translation) as a starting point to find the differential scale parameters. Assuming the known parameters to be valid, this approach warps either the orientation pattern or the target image with selected amounts of differential scale and picks the differential scale parameters that yield the best correlation.

Another approach to determination of differential scale is set forth in application Ser. No. 09/452,022 (filed Nov. 30, 1999, and entitled Method and System for Determining Image Transformation, attorney docket 60057).

2.2.2 Example Illustrating Reader Process

FIG. 7 is a diagram illustrating a re-oriented image 700 superimposed onto the original watermarked image 702. The difference in orientation and scale shows how the image was transformed and edited after the embedding process. The original watermarked image is sub-divided into tiles (e.g., pixel blocks 704, 706, etc.). When superimposed on the coordinate system of the original image 702 shown in FIG. 7, the target image blocks typically do not match the orientation of the original blocks.

The reader scans samples of the re-oriented image data, estimating the watermark information signal. It estimates the watermark information signal, in part, by predicting original sample values of the un-watermarked image. The reader then uses an inverted form of the watermarking function to estimate the watermark information signal from the watermarked signal and the predicted signal. This inverted watermarking function expresses the estimate of the watermark signal as a function of the predicted signal and the watermarked signal. Having an estimate of the watermark signal, it then uses the known relationship among the carrier signal, the watermark signal, and the raw bit to compute an estimate of the raw bit. Recall that samples in the watermark information signal are a function of the carrier signal and the raw bit value. Thus, the reader may invert this function to solve for an estimate of the raw bit value.

Recall that the embedder implementation discussed in connection with FIG. 2 redundantly encodes the watermark information signal in blocks of the input signal. Each raw bit may map to several samples within a block. In addition, the embedder repeats a mapping process for each of the blocks. As such, the reader generates several estimates of the raw bit value as it scans the watermarked image.

The information encoded in the raw bit string can be used to increase the accuracy of read operations. For instance, some of the raw bits act as signature bits that perform a validity checking function. Unlike unknown message bits, the reader knows the expected values of these signature bits. The reader can assess the validity of a read operation based on the extent to which the extracted signature bit values match the expected signature bit values. The estimates for a given raw bit value can then be given a higher weight depending on whether they are derived from a tile with a greater measure of validity.

3.0 Embedder Implementation

The following sections describe an implementation of the digital image watermark embedder depicted in FIG. 8. The embedder inserts two watermark components into the host image: a message component and a detection component (called the orientation pattern). The message component is defined in a spatial domain or other transform domain, while the orientation pattern is defined in a frequency domain. As explained later, the message component serves a dual function of conveying a message and helping to identify the watermark location in the image.

The embedder inserts the watermark message and orientation pattern in blocks of a selected color plane or planes (e.g., luminance or chrominance plane) of the host image. The message payload varies from one application to another, [illegible] domain in which it is embedded. The blocks [illegible]

[illegible] of binary raw bits that it hides in the host image. As part of this process, a message encoder 800 appends certain known bits to the message bits 802. It performs an error detection process (e.g., parity, Cyclic Redundancy Check (CRC), etc.) to generate error detection bits and adds the error detection bits to the message. An error correction coding operation then generates raw bits from the combined known and message bit string.

For the error correction operation, the embedder may employ any of a variety of error correction codes such as Reed Solomon, BCH, convolution or turbo codes. The encoder may perform an M-ary modulation process on the message bits that maps groups of message bits to a message signal based on an M-ary symbol alphabet.

In one application of the embedder, the component of the message representing the known bits is encoded more redundantly than the other message bits. This is an example of a shorter message component having greater signal strength than a longer, weaker message component. The embedder gives priority to the known bits in this scheme because the reader uses them to verify that it has found the watermark in a potentially corrupted image, rather than a signal masquerading as the watermark.

3.2 Spread Spectrum Modulation

The embedder uses spread spectrum modulation as part of the process of creating a watermark signal from the raw bits. A spread spectrum modulator 804 spreads each raw bit into a number of "chips." The embedder generates a pseudo random number that acts as the carrier signal of the message. To spread each raw bit, the modulator performs an exclusive OR (XOR) operation between the raw bit and each bit of a pseudo random binary number of a pre-determined length. The length of the pseudo random number depends, in part, on the size of the message and the image. Preferably, the pseudo random number should contain roughly the same number of zeros and ones, so that the net effect of the raw bit on the host image block is zero. If a bit value in the pseudo random number is a one, the value of the raw bit is inverted. Conversely, if the bit value is a zero, then the value of the raw bit remains the same.

The length of the pseudorandom number may vary from one message bit or symbol to another. By varying the length of the number, some message bits can be spread more than others.

3.3 Scattering the Watermark Message

The embedder scatters each of the chips corresponding to a raw bit throughout an image block.
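The XOR spreading described in section 3.2 can be sketched as follows. This is a minimal illustration under assumptions that are not in the text: the carrier is generated by shuffling an exactly balanced list of zeros and ones with a seeded generator (the text only asks for roughly equal counts), the same carrier is reused for every raw bit, and the reader recovers each bit by undoing the XOR and taking a majority vote over the chips. The names `make_carrier`, `spread` and `despread` are hypothetical.

```python
import random


def make_carrier(length, seed):
    """Pseudo random binary carrier with equal numbers of zeros and ones."""
    if length % 2:
        raise ValueError("length must be even to balance zeros and ones")
    bits = [0, 1] * (length // 2)
    random.Random(seed).shuffle(bits)  # seeded, so the reader can regenerate it
    return bits


def spread(raw_bits, carrier):
    """XOR each raw bit with every carrier bit, giving one chip group per bit.

    A carrier bit of one inverts the raw bit; a carrier bit of zero keeps it.
    """
    return [[bit ^ c for c in carrier] for bit in raw_bits]


def despread(chip_groups, carrier):
    """Undo the XOR per chip and take a majority vote (an assumed reader rule)."""
    recovered = []
    for chips in chip_groups:
        votes_for_one = sum(chip ^ c for chip, c in zip(chips, carrier))
        recovered.append(1 if 2 * votes_for_one > len(carrier) else 0)
    return recovered


if __name__ == "__main__":
    carrier = make_carrier(16, seed=7)
    chips = spread([1, 0, 1], carrier)
    chips[0][0] ^= 1  # simulate two corrupted chips in the first group
    chips[0][5] ^= 1
    print(despread(chips, carrier))
```

Spreading one bit over many chips is what lets the reader tolerate corrupted samples: a few flipped chips do not change the majority.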
An assignment map 806 assigns locations in the block to the chips of each raw bit. Each raw bit is spread over several chips. As noted above, an image block may represent a block of transform domain coefficients or samples in a spatial domain. The assignment map may be used to encode some message bits or symbols (e.g., groups of bits) more redundantly than others by mapping selected bits to more locations in the host signal than other message bits. In addition, it may be used to map different messages, or different components of the same message, to different locations in the host signal.

FIG. 9 depicts an example of the assignment map. Each of the blocks in FIG. 9 corresponds to an image block and depicts a pattern of chips corresponding to a single raw bit. FIG. 9 depicts a total of 32 example blocks. The pattern within a block is represented as black dots on a white background. Each of the patterns is mutually exclusive such that each raw bit maps to a pattern of unique locations relative to the patterns of every other raw bit. Though not a requirement, the combined patterns, when overlapped, cover every location within the image block.

3.4 Gain Control and Perceptual Analysis

To insert the information carried in a chip to the host image, the embedder alters the corresponding sample value in the host image. In particular, for a chip having a value of one, it adds to the corresponding sample value, and for a chip having a value of zero, it subtracts from the corresponding sample value. A gain controller in the embedder adjusts the extent to which each chip adds or subtracts from the corresponding sample value.

The gain controller takes into account the orientation pattern when determining the gain. It applies a different gain to the orientation pattern than to the message component of the watermark. After applying the gain, the embedder combines the orientation pattern and message components together to form the composite watermark signal, and combines the composite watermark with the image block. One way to combine these signal components is to add them, but other linear or non-linear functions may be used as well.

The orientation pattern is comprised of a pattern of quad symmetric impulse functions in the spatial frequency domain. In the spatial domain, these impulse functions look like cosine waves. An example of the orientation pattern is depicted in FIGS. 10 and 11. FIG. 10 shows the impulse functions as points in the spatial frequency domain, while FIG. 11 shows the orientation pattern in the spatial domain. Before adding the orientation pattern component to the message component, the embedder may transform the watermark components to a common domain. For example, if the message component is in a spatial domain and the orientation component is in a frequency domain, the embedder transforms the orientation component to a common spatial domain before combining them together.

FIG. 8 depicts the gain controller used in the embedder. Note that the gain controller operates on the blocks of image samples 808, the message watermark signal, and a global gain input 810, which may be specified by the user. A perceptual analyzer component 812 of the gain controller performs a perceptual analysis on the block to identify samples that can tolerate a stronger watermark signal without substantially impacting visibility. In places where the

this task. These filters include an edge detector filter that identifies edges of objects in the image, a non-linear filter to map gain values into a desired range, and averaging or median filters to smooth the gain values. Each of these filters may be implemented as a series of one-dimensional filters (one operating on rows and the other on columns) or two-dimensional filters. The size of the filters (i.e. the number of samples processed to compute a value for a given location) may vary (e.g., 3 by 3, 5 by 5, etc.). The shape of the filters may vary as well (e.g., square, cross-shaped, etc.). The perceptual analyzer process produces a detailed gain multiplier. The multiplier is a vector with elements corresponding to samples in a block.

Another component 818 of the gain controller computes an asymmetric gain based on the output of the image sample values and message watermark signal. This component analyzes the samples of the block to determine whether they are consistent with the message signal. The embedder reduces the gain for samples whose values relative to neighboring values are consistent with the message signal.

The embedder applies the asymmetric gain to increase the chances of an accurate read in the watermark reader. To understand the effect of the asymmetric gain, it is helpful to explain the operation of the reader. The reader extracts the watermark message signal from the watermarked signal using a predicted version of the original signal. It estimates the watermark message signal value based on values of the predicted signal and the watermarked signal at locations of the watermarked signal suspected of containing a watermark signal. There are several ways to predict the original signal. One way is to compute a local average of samples around the sample of interest. The average may be computed by taking the average of vertically adjacent samples, horizontally adjacent samples, an average of samples in a cross-shaped filter (both vertical and horizontal neighbors), an average of samples in a square-shaped filter, etc. The estimate may be computed one time based on a single predicted value from one of these averaging computations. Alternatively, several estimates may be computed based on two or more of these averaging computations (e.g., one estimate for vertically adjacent samples and another for horizontally adjacent samples). In the latter case, the reader may keep estimates if they satisfy a similarity metric. In other words, the estimates are deemed valid if they are within a predetermined value or have the same polarity.

Knowing this behavior of the reader, the embedder computes the asymmetric gain as follows. For samples that have values relative to their neighbors that are consistent with the watermark signal, the embedder reduces the asymmetric gain. Conversely, for samples that are inconsistent with the watermark signal, the embedder increases the asymmetric gain. For example, if the chip value is a one, then the sample is consistent with the watermark signal if its value is greater than its neighbors. Alternatively, if the chip value is a zero, then the sample is consistent with the watermark signal if its value is less than its neighbors.

Another component 820 of the gain controller computes a differential gain, which represents an adjustment in the message vs. orientation pattern gains. As the global gain increases, the embedder emphasizes the message gain over

naked eye is less likely to notice the watermark, the per- the orientation pattem gain by adjusting the global gain by 

ceptual analyzer increases the strength of the watermark. an adjustment factor. The inputs to this process 820 include 

Conversely, it decreases the watermark strength where the the global gain 810 and a message differential gain 822. 

eye is more likely to notice the watermark. When the global gain is below a lower threshold, the 

The perceptual analyzer shown in FIG. 8 performs a series 65 adjustment factor is one. When the global gain is above an 

of filtering operations on the image block to compute an upper threshold, the adjustment factor is set to an upper limit 

array of gain values. There are a variety of filters suitable for greater than one. For global gains falling within the two 




thresholds, the adjustment factor increases linearly between one and
the upper limit. The message differential gain is the product of the
adjustment factor and the global gain.

At this point, there are four sources of gain: the detailed gain,
the global gain, the asymmetric gain, and the message dependent
gain. The embedder applies the first two gain quantities to both the
message and orientation watermark signals. It only applies the
latter two to the message watermark signal. FIG. 8 depicts how the
embedder applies the gain to the two watermark components. First, it
multiplies the detailed gain with the global gain to compute the
orientation pattern gain. It then multiplies the orientation pattern
gain with the adjusted message differential gain and asymmetric gain
to form the composite message gain.

Finally, the embedder forms the composite watermark signal. It
multiplies the composite message gain with the message signal, and
multiplies the orientation pattern gain with the orientation pattern
signal. It then combines the result in a common transform domain to
get the composite watermark. The embedder applies a watermarking
function to combine the composite watermark to the block to create a
watermarked image block. The message and orientation components of
the watermark may be combined by mapping the message bits to samples
of the orientation signal, and modulating the samples of the
orientation signal to encode the message.

The embedder computes the watermark message signal by converting the
output of the assignment map 806 to delta values, indicating the
extent to which the watermark signal changes the host signal. As
noted above, a chip value of one corresponds to an upward adjustment
of the corresponding sample, while a chip value of zero corresponds
to a downward adjustment. The embedder specifies the specific amount
of adjustment by assigning a delta value to each of the watermark
message samples (830).

4.0 Detector Implementation

FIG. 12 illustrates an overview of a watermark detector that detects
the presence of a detection watermark in a host image and its
orientation. Using the orientation pattern and the known bits
inserted in the watermark message, the detector determines whether a
potentially corrupted image contains a watermark, and if so, its
orientation in the image.

Recall that the composite watermark is replicated in blocks of the
original image. After an embedder places the watermark in the
original digital image, the watermarked image is likely to undergo
several transformations, either from routine processing or from
intentional tampering. Some of these transformations include:
compression, decompression, color space conversion, digital to
analog conversion, printing, scanning, analog to digital conversion,
scaling, rotation, inversion, flipping, differential scale, and lens
distortion. In addition to these transformations, various noise
sources can corrupt the watermark signal, such as fixed pattern
noise, thermal noise, etc.

When building a detector implementation for a particular
application, the developer may implement counter-measures to
mitigate the impact of the types of transformations, distortions and
noise expected for that application. Some applications may require
more counter-measures than others. The detector described below is
designed to recover a watermark from a watermarked image after the
image has been printed and scanned. The following sections describe
the counter-measures to mitigate the impact of various forms of
corruption. The developer can select from among these
counter-measures when implementing a detector for a particular
application.

For some applications, the detector will operate in a system that
provides multiple image frames of a watermarked object. One typical
example of such a system is a computer equipped with a digital
camera. In such a configuration, the digital camera can capture a
temporal sequence of images as the user or some device presents the
watermarked image to the camera.

As shown in FIG. 12, the principal components of the detector are:
1) pre-processor 900; 2) rotation and scale estimator 902; 3)
orientation parameter refiner 904; 4) translation estimator 906; 5)
translation refiner 908; and 6) reader 910.

The preprocessor 900 takes one or more frames of image data 912 and
produces a set of image blocks 914 prepared for further analysis.
The rotation-scale estimator 902 computes rotation-scale vectors 916
that estimate the orientation of the orientation signal in the image
blocks. The parameter refiner 904 collects additional evidence of
the orientation signal and further refines the rotation-scale vector
candidates by estimating differential scale parameters. The result
of this refining stage is a set of 4D vector candidates 918
(rotation, scale, and two differential scale parameters). The
translation estimator 906 uses the 4D vector candidates to re-orient
image blocks with promising evidence of the orientation signal. It
then finds estimates of translation parameters 920. The translation
refiner 908 invokes the reader 910 to assess the merits of an
orientation vector. When invoked by the detector, the reader uses
the orientation vector to approximate the original orientation of
the host image and then extracts values for the known bits in the
watermark message. The detector uses this information to assess the
merits of and refine orientation vector candidates.

By comparing the extracted values of the known bits with the
expected values, the reader provides a figure of merit for an
orientation vector candidate. The translation refiner then picks a
6D vector, including rotation, scale, differential scale and
translation, that appears likely to produce a valid read of the
watermark message 922. The following sections describe
implementations of these components in more detail.

4.1 Detector Pre-processing

FIG. 13 is a flow diagram illustrating preprocessing operations in
the detector shown in FIG. 12. The detector performs a series of
pre-processing operations on the native image 930 to prepare the
image data for further analysis. It begins by filling memory with
one or more frames of native image data (932), and selecting sets of
pixel blocks 934 from the native image data for further analysis
(936). While the detector can detect a watermark using a single
image frame, it also has support for detecting the watermark using
additional image frames. As explained below, the use of multiple
frames has the potential for increasing the chances of an accurate
detection and read.

In applications where a camera captures an input image of a
watermarked object, the detector may be optimized to address
problems resulting from movement of the object. Typical PC cameras,
for example, are capable of capturing images at a rate of at least
10 frames a second. A frustrated user might attempt to move the
object in an attempt to improve detection. Rather than improving the
chances of detection, the movement of the object changes the
orientation of the watermark from one frame to the next, potentially
making the watermark more difficult to detect. One way to address
this problem is to buffer one or more frames, and then screen the
frame or frames to determine if they are likely to contain a valid
watermark signal. If such screening indicates that a frame is not
likely to contain a valid signal,
the detector can discard it and proceed to the next frame in the
buffer, or buffer a new frame. Another enhancement is to isolate
portions of a frame that are most likely to have a valid watermark
signal, and then perform more detailed detection of the isolated
portions.

After loading the image into the memory, the detector selects image
blocks 934 for further analysis. It is not necessary to load or
examine each block in a frame because it is possible to extract the
watermark using only a portion of an image. The detector looks at
only a subset of the samples in an image, and preferably analyzes
samples that are more likely to have a recoverable watermark signal.

The detector identifies portions of the image that are likely to
have the highest watermark signal to noise ratio. It then attempts
to detect the watermark signal in the identified portions. In the
context of watermark detection, the host image is considered to be a
source of noise along with conventional noise sources. While it is
typically not practical to compute the signal to noise ratio, the
detector can evaluate attributes of the signal that are likely to
evince a promising watermark signal to noise ratio. These properties
include the signal activity (as measured by sample variance, for
example), and a measure of the edges (abrupt changes in image sample
values) in an image block. Preferably, the signal activity of a
candidate block should fall within an acceptable range, and the
block should not have a high concentration of strong edges. One way
to quantify the edges in the block is to use an edge detection
filter (e.g., a Laplacian, Sobel, etc.).

In one implementation, the detector divides the input image into
blocks, and analyzes each block based on predetermined metrics. It
then ranks the blocks according to these metrics. The detector then
operates on the blocks in the order of the ranking. The metrics
include sample variance in a candidate block and a measure of the
edges in the block. The detector combines these metrics for each
candidate block to compute a rank representing the probability that
it contains a recoverable watermark signal.

In another implementation, the detector selects a pattern of blocks
and evaluates each one to try to make the most accurate read from
the available data. In either implementation, the block pattern and
size may vary. This particular implementation selects a pattern of
overlapping blocks (e.g., a row of horizontally aligned, overlapping
blocks). One optimization of this approach is to adaptively select a
block pattern that increases the signal to noise ratio of the
watermark signal. While shown as one of the initial operations in
the preparation, the selection of blocks can be postponed until
later in the pre-processing stage.

Next, the detector performs a color space conversion on native image
data to compute an array of image samples in a selected color space
for each block (936). In the following description, the color space
is luminance, but the watermark may be encoded in one or more
different color spaces. The objective is to get a block of image
samples with the lowest noise practical for the application. While
the implementation currently performs a row by row conversion of the
native image data into 8 bit integer luminance values, it may be
preferable to convert to floating-point values for some
applications. One optimization is to select a luminance converter
that is adapted for the sensor used to capture the digital input
image. For example, one might experimentally derive the lowest noise
luminance conversion for commercially available sensors, e.g., CCD
cameras or scanners, CMOS cameras, etc. Then, the detector could be
programmed to select either a default luminance converter, or one
tuned to a specific type of sensor.

At one or more stages of the detector, it may be useful to perform
operations to mitigate the impact of noise and distortion. In the
pre-processing phase, for example, it may be useful to evaluate
fixed pattern noise and mitigate its effect (938). The detector may
look for fixed pattern noise in the native input data or the
luminance data, and then mitigate it.

One way to mitigate certain types of noise is to combine data from
different blocks in the same frame, or corresponding blocks in
different frames 940. This process helps augment the watermark
signal present in the blocks, while reducing the noise common to the
blocks. For example, merely adding blocks together may mitigate the
effects of common noise.

In addition to common noise, other forms of noise may appear in each
of the blocks, such as noise introduced in the printing or scanning
processes. Depending on the nature of the application, it may be
advantageous to perform common noise recognition and removal at this
stage 942. The developer may select a filter or series of filters to
target certain types of noise that appear during experimentation
with images. Certain types of median filters may be effective in
mitigating the impact of spectral peaks (e.g., speckles) introduced
in printing or scanning operations.

In addition to introducing noise, the printing and image capture
processes may transform the color or orientation of the original,
watermarked image. As described above, the embedder typically
operates on a digital image in a particular color space and at a
desired resolution. The watermark embedders normally operate on
digital images represented in an RGB or CYMK color space at a
desired resolution (... the resolution at which the image is
printed). The images are then printed on paper with a screen
printing process that uses the CYMK subtractive color space at a
line per inch (LPI) value ranging from 65-200. 133 lines/in is
typical for quality magazines and 73 lines/in is typical for
newspapers. In order to produce a quality image and avoid
pixelization, the rule of thumb is to use digital images with a
resolution that is at least twice the press resolution. This is due
to the half tone printing for color production. Also, different
presses use screens with different patterns and line orientations
and have different precision for color registration.

One way to counteract the transforms introduced through the printing
process is to develop a model that characterizes these transforms
and optimize watermark embedding and detecting based on this
characterization. Such a model may be developed by passing
watermarked and unwatermarked images through the printing process
and observing the changes that occur to these images. The resulting
model characterizes the changes introduced due to the printing
process. The model may represent a transfer function that
approximates the transforms due to the printing process. The
detector then implements a pre-processing stage that reverses or at
least mitigates the effect of the printing process on watermarked
images. The detector may implement a pre-processing stage that
performs the inverse of the transfer function for the printing
process.

A related challenge is the variety in paper attributes used in
different printing processes. Papers of various qualities, thickness
and stiffness, absorb ink in various ways. Some papers absorb ink
evenly, while others absorb ink at rates that vary with the changes
in the paper's texture and thickness. These variations may degrade
the embedded watermark signal when a digitally watermarked image is
printed. The watermark process can counteract these effects by
classifying and characterizing paper so that the embedder and reader
can compensate for this printing-related degradation.
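
The block-ranking step described in this section (signal activity measured by sample variance, combined with a measure of edges) can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions: the acceptable variance range, the scoring formula, and the names `edge_measure`, `block_score` and `rank_blocks` are hypothetical and are not taken from the specification.

```python
from statistics import pvariance


def edge_measure(block):
    """Mean absolute response of a 4-neighbour Laplacian over interior samples."""
    h, w = len(block), len(block[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (block[y - 1][x] + block[y + 1][x]
                   + block[y][x - 1] + block[y][x + 1]
                   - 4 * block[y][x])
            total += abs(lap)
    return total / ((h - 2) * (w - 2))


def block_score(block, var_range=(25.0, 2500.0), edge_weight=1.0):
    """Hypothetical score: zero when the signal activity (variance) falls
    outside the acceptable range, otherwise lower for blocks with strong edges."""
    variance = pvariance([v for row in block for v in row])
    low, high = var_range  # assumed range, chosen only for illustration
    if not low <= variance <= high:
        return 0.0
    return 1.0 / (1.0 + edge_weight * edge_measure(block))


def rank_blocks(blocks):
    """Return block indices ordered from most to least promising."""
    return sorted(range(len(blocks)),
                  key=lambda i: block_score(blocks[i]),
                  reverse=True)


if __name__ == "__main__":
    flat = [[128] * 4 for _ in range(4)]                             # no activity
    edgy = [[100 + 40 * ((x + y) % 2) for x in range(4)] for y in range(4)]
    ramp = [[100 + 5 * (x + y) for x in range(4)] for y in range(4)]  # smooth
    print(rank_blocks([flat, edgy, ramp]))
```

A detector built this way would then process blocks in the returned order, so that the blocks most likely to yield a recoverable watermark are examined first.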
Variations in image capture processes also pose a challenge. In some
applications, it is necessary to address problems introduced due to
interlaced image data. Some video cameras produce interlaced fields
representing the odd or even scan lines of a frame. Problems arise
when the interlaced image data consists of fields from two
consecutive frames. To construct an entire frame, the preprocessor
may combine the fields from consecutive frames while dealing with
the distortion due to motion that occurs from one frame to the next.
For example, it may be necessary to shift one field before
interleaving it with another field to counteract inter-frame motion.
A de-blurring function may be used to mitigate the blurring effect
due to the motion between frames.

Another problem associated with cameras in some applications is
blurring due to the lack of focus. The preprocessor can mitigate
this effect by estimating parameters of a blurring function and
applying a de-blurring function to the input image.

Yet another problem associated with cameras is that they tend to
have color sensors that utilize different color pattern
implementations. As such, a sensor may produce colors slightly
different than those represented in the object being captured. Most
CCD and CMOS cameras use an array of sensors to produce colored
images. The sensors in the array are arranged in clusters of sensors
sensitive to three primary colors (red, green, and blue) according
to a specific pattern. Sensors designated for a particular color are
dyed with that color to increase their sensitivity to the designated
color. Many camera manufacturers use a Bayer color pattern GR/BG.
While this pattern produces good image quality, it causes color
mis-registration that degrades the watermark signal. Moreover, the
color space converter, which maps the signal from the sensors to
another color space such as YUV or RGB, may vary from one
manufacturer to another. One way to counteract the mis-registration
of the camera's color pattern is to account for the distortion due
to the pattern in a color transformation process, implemented either
within the camera itself, or as a pre-processing function in the
detector.

Another challenge in counteracting the effects of the image capture
process is dealing with the different types of distortion introduced
from various image capture devices. For example, cameras have
different sensitivities to light. In addition, their lenses have
different spherical distortion and noise characteristics. Some
scanners have poor color reproduction or introduce distortion in the
image aspect ratio. Some scanners introduce aliasing and employ
interpolation to increase resolution. The detector can counteract
these effects in the pre-processor by using an appropriate inverse
transfer function. An off-line process first characterizes the
distortion of several different image capture devices (e.g., by
passing test images through the scanner and deriving a transfer
function modeling the scanner distortion). Some detectors may be
equipped with a library of such inverse transfer functions from
which they select one that corresponds to the particular image
capture device.

Yet another challenge in applications where the image is printed on
paper and later scanned is that the paper deteriorates over time and
degrades the watermark. Also, varying lighting conditions make the
watermark difficult to detect. Thus, the watermark may be selected
so as to be more impervious to expected deterioration, and
recoverable over a wider range of lighting conditions.

At the close of the pre-processing stage, the detector has selected
a set of blocks for further processing. It then proceeds to gather
evidence of the orientation signal in these blocks, and estimate the
orientation parameters of promising orientation signal candidates.
Since the image may have suffered various forms of corruption, the
detector may identify several parts of the image that appear to have
attributes similar to the orientation signal. As such, the detector
may have to resolve potentially conflicting and ambiguous evidence
of the orientation signal. To address this challenge, the detector
estimates orientation parameters, and then refines these estimates
to extract the orientation parameters that are more likely to evince
a valid signal than other parameter candidates.

4.2 Estimating Initial Orientation Parameters 

FIG. 14 is a flow diagram illustrating a process for estimating
rotation-scale vectors. The detector loops over each image block
(950), calculating rotation-scale vectors with the best detection
values in each block. First, the detector filters the block in a
manner that tends to amplify the orientation signal while
suppressing noise, including noise from the host image itself (952).
Implemented as a multi-axis Laplacian filter, the filter highlights
edges (e.g., high frequency components of the image) and then
suppresses them. The term "multi-axis" means that the filter
includes a series of stages that each operate on a particular axis.
First, the filter operates on the rows of luminance samples, then
operates on the columns, and adds the results. The filter may be
applied along other axes as well. Each pass of the filter produces
values at discrete levels. The final result is an array of samples,
each having one of five values: {-2, -1, 0, 1, 2}.
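
The row pass, column pass, and summation described above can be sketched as follows. The text does not spell out how each pass is quantized, so this sketch assumes that each pass outputs the sign (-1, 0 or 1) of a 1-D Laplacian (second difference) and that block borders are handled by replicating the edge sample. Both choices, and the function names, are assumptions made for illustration.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def axis_pass(samples):
    """One filter pass along a single line of samples (assumed form):
    the sign of the 1-D Laplacian, with edge samples replicated."""
    n = len(samples)
    out = []
    for i in range(n):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, n - 1)]
        out.append(sign(2 * samples[i] - left - right))
    return out


def multi_axis_filter(block):
    """Apply the pass to rows, then to columns, and add the results.
    Every output sample is one of -2, -1, 0, 1, 2."""
    rows = [axis_pass(row) for row in block]               # rows[y][x]
    cols = [axis_pass(list(col)) for col in zip(*block)]   # cols[x][y]
    height, width = len(block), len(block[0])
    return [[rows[y][x] + cols[x][y] for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    luminance = [[10, 10, 10],
                 [10, 50, 10],
                 [10, 10, 10]]
    for row in multi_axis_filter(luminance):
        print(row)
    # prints [0, -1, 0], then [-1, 2, -1], then [0, -1, 0]
```

Summing two three-level passes is what yields the five output levels; adding passes along further axes, as the text allows, would widen that range.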

Next, the detector performs a windowing operation on the 
block data to prepare it for an FFT transform (954). This 
windowing operation provides signal continuity at the block 
edges. The detector then performs an FFT (956) on the 
block, and retains only the magnitude component (958). 
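
A minimal sketch of the windowing and magnitude steps (954 to 958) is given below. The choice of a separable Hann window is an assumption, since the text only requires continuity at the block edges, and a naive DFT stands in for the FFT so that the example needs nothing beyond the standard library. The function names are hypothetical.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft(seq):
    """Naive 1-D discrete Fourier transform; an FFT computes the same values."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]


def windowed_magnitude(block):
    """Window the block, take a 2-D transform, and keep only the magnitude."""
    h, w = len(block), len(block[0])
    wy, wx = hann(h), hann(w)
    windowed = [[block[y][x] * wy[y] * wx[x] for x in range(w)]
                for y in range(h)]
    rows = [dft(row) for row in windowed]                            # along x
    cols = [dft([rows[y][x] for y in range(h)]) for x in range(w)]   # along y
    return [[abs(cols[x][fy]) for x in range(w)] for fy in range(h)]


if __name__ == "__main__":
    # A horizontal cosine with two cycles across an 8x8 block.
    block = [[math.cos(2 * math.pi * 2 * x / 8) for x in range(8)]
             for _ in range(8)]
    magnitude = windowed_magnitude(block)
    print(["%.3f" % v for v in magnitude[0]])
```

Discarding the phase makes the result insensitive to translation of the block contents, which is why the rotation and scale search can proceed before translation is known.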

In an alternative implementation, the detector may use the 
phase signal produced by the FFT to estimate the translation 
parameter of the orientation signal. For example, the detec- 
tor could use the rotation and scale parameters extracted in 
the process described below, and then compute the phase 
that provided the highest measure of correlation with the 
orientation signal using the phase component of the FFT 
process. 

After computing the FFT, the detector applies a Fourier magnitude
filter (960) on the magnitude components. The filter in the
implementation slides over each sample in the Fourier magnitude
array and filters the sample's eight neighbors in a square
neighborhood centered at the sample. The filter boosts values
representing a sharp peak with a rapid fall-off, and suppresses the
fall-off portion. It also performs a threshold operation to clip
peaks to an upper threshold.
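
One plausible form of such a filter is sketched below. The text does not give the filter formula, so this sketch assumes that each sample is replaced by its ratio to the mean of its eight neighbors and that the ratio is clipped to an upper threshold. The formula, the default threshold, and the name `magnitude_peak_filter` are illustrative assumptions only.

```python
def magnitude_peak_filter(mag, upper=4.0, eps=1e-9):
    """Assumed peak filter: ratio of each sample to the mean of its eight
    neighbours, clipped to `upper`. Border samples are left at zero."""
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = [mag[y + dy][x + dx]
                          for dy in (-1, 0, 1)
                          for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            ratio = mag[y][x] / (sum(neighbours) / 8 + eps)
            out[y][x] = min(ratio, upper)  # clip strong peaks
    return out


if __name__ == "__main__":
    mag = [[1.0] * 5 for _ in range(5)]
    mag[2][2] = 10.0  # an isolated sharp peak
    filtered = magnitude_peak_filter(mag)
    print(filtered[2][2], round(filtered[1][1], 4))
```

Under this assumption an isolated peak is boosted up to the clip level, while the samples immediately around it, whose neighborhoods contain the peak, receive ratios below one and are thereby suppressed.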

Next, the detector performs a log-polar re-sample (962) of the
filtered Fourier magnitude array to produce a log-polar array 964.
This type of operation is sometimes referred to as a Fourier Mellin
transform. The detector, or some off-line pre-processor, performs a
similar operation on the orientation signal to map it to the
log-polar coordinate system. Using matching filters, the detector
implementation searches for an orientation signal in a specified
window of the log-polar coordinate system. For example, consider
that the log-polar coordinate system is a two dimensional space with
the scale being the vertical axis and the angle being the horizontal
axis. The window ranges from 0 to 90 degrees on the horizontal axis
and from approximately 50 to 2400 dpi on the vertical axis. Note
that the orientation pattern should be selected so that routine
scaling does not push the orientation pattern out of this window.
The orientation pattern can be designed to mitigate this problem, as
noted above, and as
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explained in co-pending patent application Ser. No. 60/136, The statistical analysis may also include a maximum 

572, filed May 28, 1999, by Ammon Gustafson, entitled likeUhood analysis. In such an analysis, an off-line detector 

Watermarking System With Improved Technique for Detect- generates detection value statistics for both marked and 

mg Scahng and Rotation^ filed May 28, 1999. unmarked images. Based on the probability distributions of 

TTie detector proceeds to correlate the orientation and the 5 niarked and unmarked images, it determiiis the likeUhood 

t'i^f '^''^TV^''''^-^.f°':?± that a given detection value f^r an input image originates 

f^g^ i^'rMV^r""^"''"^?^"'''^'^?':^'^^^ from a marked and unmarked image. ^ ^ ^ 

(9oo). Ine GMF performs an FFT on the onentation and ». .u j r .u 1 ^- ■ , 

target signal, m^tipUes the resulting Fourier domain ^ of these correlation stages, the dete-ctor has 

entities, to6 pcdorJan inverse FFI. if is process yields a * '.^"^'^ °^ rotation-scale vectors 990, each 

rectangular array of values in log-polar coordinates, each 7* * quantized measure of correlation associated with it. 

representing a measure of correlation and having a corre- '"^ detector could simply choose the rotation 

spending toUtion angle and scale vector. As an optimization, vectors with the highest rank and proceed to 

the detector may also perform the same correlation opera- compute other orientation parameters, such as differential 

tions for distorted versions (968, 970, 972) of the orientation translation. Instead, the detector gathers more 

signal to see if any of the distorted orientation patterns evidence to refine the rotation-scale vector estimates. FIG. 

results in a higher measure of correlation. For example, the 15 is a flow diagram illustrating a process for refining the 

detector may repeat the correlation operation with some orientation parameters using evidence of the orientation 

pre-determined amount of horizontal and vertical differen- signal collected from blocks in the current frame, 

tial distortion (970, 972). The result of this correlation Continuing in the current frame, the detector proceeds to 

process is an array of correlation values 974 specifying the 20 compare the rotation and scale parameters from different 

amount of correlation that each corresponding rotation-scale blocks (e.g., block 0, block 1, block 2; 1000, 1002, and 1004 

P'°'^'^^- ^. ^ in FIG. 15). In a process referred to as interblock coinci- 

pe detector processes this array to find the top M peaks dence matching 1006, it looks for similarities between 

fnitr f ^° rotation-scale parameters that yielded the highest correlation 

location more accurately, the detector uses interpolation to ^5 different blocks. To quantify this similarity, it computes 

provide the mter-sample location of each of the top peaks tu ~ . • j- . u ^"'"'t'"^^' " 

978. Tlie interpolator computes the 2D median of lh^ the geometric d^tance between each peak m one block with 

samples around a peak and provides the location oTthe pe^ 'Tk°, .'Tf ° ^' 5'° ^'^r'Tl^' 

center to an accuracy of 0 1 sample probability that peaks will fall withm this calculated dis- 

The detector proceeds to rank the top rotation-scale tance. There are a vanety of ways to calculate the probabil- 

vectors based on yet another correlation process 980. In ^° '° ™^ implementation, the detector computes the geo- 

particular, the detector performs a correlation between a distance between two peaks, computes the circular 

Fourier magnihide representation for each rotation-scale encompassing the two peaks (jt(geometric distance)^, 

vector candidate and a Fourier magnitude specification of ^""^ computes the ratio of this area to the total area of the 

the orientation signal 982. Each Fourier magnitude repre- block. Finally, it quantizes this probability measure for each 

sentation is scaled and rotated by an amount reflected by the
rotation-scale vector. This correlation operation sums a
point-wise multiplication of the orientation pattern impulse
functions in the frequency domain with the Fourier magnitude
values of the image at corresponding frequencies to compute a
measure of correlation for each peak 984. The detector then
sorts correlation values for the peaks (986).

Finally, the detector computes a detection value for each
peak (988). It computes the detection value by quantizing
the correlation values. Specifically, it computes a ratio of the
peak's correlation value and the correlation value of the next
largest peak. Alternatively, the detector may compute the
ratio of the peak's correlation value and a sum or average of
the correlation values of the next n highest peaks, where n
is some predetermined number. Then, the detector maps this
ratio to a detection value based on a statistical analysis of
unmarked images.

The statistical analysis plots a distribution of peak ratio
values found in unmarked images. The ratio values are
mapped to a detection value based on the probability that the
value came from an unmarked image. For example, 90% of
the ratio values in unmarked images fall below a first
threshold T1, and thus, the detection value mapping for a
ratio of T1 is set to 1. Similarly, 99% of the ratio values in
unmarked images fall below T2, and therefore, the detection
value is set to 2. 99.9% of the ratio values in unmarked
images fall below T3, and the corresponding detection value
is set to 3. The threshold values, T1, T2 and T3, may be
determined by performing a statistical analysis of several
images. The mapping of ratios to detection values based on
the statistical distribution may be implemented in a look up
table.

pair of peaks (1008) by computing the log (base 10) of the
ratio of the total area over the area encompassing the two
peaks. At this point, the detector has calculated two
detection values: quantized peak value, and the quantized
distance metric.

The detector now forms multi-block grouping of rotation-scale
vectors and computes a combined detection value for
each grouping (1010). The detector groups vectors based on
their relative geometric proximity within their respective
blocks. It then computes the combined detection value by
combining the detection values of the vectors in the group
(1012). One way to compute a combined detection value is
to add the detection values or add a weighted combination
of them.

Having calculated the combined detection values, the
detector sorts each grouping by its combined detection value
(1014). This process produces a set of the top groupings of
unrefined rotation-scale candidates, ranked by detection
value 1016. Next, the detector weeds out rotation-scale
vectors that are not promising by excluding those groupings
whose combined detection values are below a threshold (the
"refine threshold" 1018). The detector then refines each
individual rotation-scale vector candidate within the
remaining groupings.

The detector refines a rotation-scale vector by adjusting
the vector and checking to see whether the adjustment
results in a better correlation. As noted above, the detector
may simply pick the best rotation-scale vector based on the
evidence collected thus far, and refine only that vector. An
alternative approach is to refine each of the top rotation-scale
vector candidates, and continue to gather evidence for
each candidate. In this approach, the detector loops over
each vector candidate (1020), refining each one.
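
The quantization of peak ratios and the ranking of groupings
described above can be sketched as follows. This is a minimal
illustration, not the implementation referred to in the text: the
threshold numbers, the function names, and the representation of a
grouping as a plain list of detection values are all assumptions
made for the example.

```python
from bisect import bisect_right

# Hypothetical thresholds T1 < T2 < T3 (made-up numbers). In practice they
# would be the 90th, 99th and 99.9th percentile ratios of unmarked images.
THRESHOLDS = (1.5, 2.2, 3.1)

def peak_ratio(correlations, n=1):
    """Ratio of the largest correlation to the average of the next n largest."""
    ranked = sorted(correlations, reverse=True)
    return ranked[0] / (sum(ranked[1:1 + n]) / n)

def detection_value(ratio, thresholds=THRESHOLDS):
    """Look-up table: count of thresholds the ratio meets or exceeds (0 to 3)."""
    return bisect_right(thresholds, ratio)

def combined_detection_value(values, weights=None):
    """Plain or weighted sum of the detection values in one grouping."""
    weights = weights or [1] * len(values)
    return sum(w * v for w, v in zip(weights, values))

def promising_groupings(groupings, refine_threshold):
    """Rank groupings by combined value and drop those below the threshold."""
    scored = [(combined_detection_value(g), g) for g in groupings]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [g for score, g in scored if score >= refine_threshold]
```

A ratio exactly equal to T1 maps to 1, matching the mapping stated
in the text; ratios below T1 map to 0.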
One approach of refining the orientation vector candidates is
as follows:

fix the orientation signal impulse functions ("points")
within a valid boundary (1022);

pre-refine the rotation-scale vector (1024);

find the major axis and re-fix the orientation points
(1026); and

refine each vector with the addition of a differential scale
component (1028).

In this approach, the detector pre-refines a rotation-scale
vector by incrementally adjusting one of the parameters
(scale, rotation angle), adjusting the orientation points, and
then summing a point-wise multiplication of the orientation
pattern and the image block in the Fourier magnitude
domain. The refiner compares the resulting measure of
correlation with previous measures and continues to adjust
one of the parameters so long as the correlation increases.
After refining the scale and rotation angle parameters, the
refiner finds the major axis, and re-fixes the orientation
points. It then repeats the refining process with the
introduction of differential scale parameters. At the end of this
process, the refiner has converted each scale-rotation
candidate to a refined 4D vector, including rotation, scale, and
two differential scale parameters.
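
The pre-refinement step, which adjusts one parameter at a time for
as long as the correlation increases, can be sketched as a simple
coordinate-wise hill climb. The `correlate` callable stands in for
the point-wise multiplication in the Fourier magnitude domain; it,
the step sizes, and the function names are hypothetical and chosen
only for illustration.

```python
def refine_parameter(correlate, vector, index, step):
    """Adjust vector[index] in fixed steps while the correlation increases."""
    best = list(vector)
    best_score = correlate(best)
    for direction in (+1, -1):
        while True:
            trial = list(best)
            trial[index] += direction * step
            score = correlate(trial)
            if score <= best_score:
                break  # no further improvement in this direction
            best, best_score = trial, score
    return best, best_score

def pre_refine(correlate, rotation_scale, steps=(0.5, 0.01)):
    """Refine rotation (index 0), then scale (index 1), one at a time."""
    vector = list(rotation_scale)
    score = correlate(vector)
    for index, step in enumerate(steps):
        vector, score = refine_parameter(correlate, vector, index, step)
    return vector, score
```

The same loop could be run again over two additional entries to
introduce the differential scale parameters and arrive at a 4D
vector.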

At this stage, the detector can pick a 4D vector or set of
4D vectors and proceed to calculate the remaining
parameter, translation. Alternatively, the detector can collect
additional evidence about the merits of each 4D vector.

One way to collect additional evidence about each 4D
vector is to re-compute the detection value of each
orientation vector candidate (1030). For example, the detector may
quantize the correlation value associated with each 4D
vector as described above for the rotation-scale vector peaks
(see item 988, FIG. 14 and accompanying text). Another
way to collect additional evidence is to repeat the
coincidence matching process for the 4D vectors. For this
coincidence matching process, the detector computes spatial
domain vectors for each candidate (1032), determines the
distance metric between candidates from different blocks,
and then groups candidates from different blocks based on
the distance metrics (1034). The detector then re-sorts the
groups according to their combined detection values (1036)
to produce a set of the top P groupings 1038 for the frame.

FIG. 16 is a flow diagram illustrating a method for
aggregating evidence of the orientation signal from multiple
frames. In applications with multiple frames, the detector
collects the same information for orientation vectors of the
selected blocks in each frame (namely, the top P groupings
of orientation vector candidates, e.g., 1050, 1052 and 1054).
The detector then repeats coincidence matching between
orientation vectors of different frames (1056). In particular,
in this inter-frame mode, the detector quantizes the distance
metrics computed between orientation vectors from blocks
in different frames (1058). It then finds inter-frame
groupings of orientation vectors (super-groups) using the same
approach described above (1060), except that the orientation
vectors are derived from blocks in different frames. After
organizing orientation vectors into super-groups, the
detector computes a combined detection value for each
super-group (1062) and sorts the super-groups by this detection
value (1064). The detector then evaluates whether to
proceed to the next stage (1066), or repeat the above process of
computing orientation vector candidates from another frame
(1068).

If the detection values of one or more super-groups
exceed a threshold, then the detector proceeds to the next
stage. If not, the detector gathers evidence of the orientation
signal from another frame and returns to the inter-frame
coincidence matching process. Ultimately, when the
detector finds sufficient evidence to proceed to the next stage, it
selects the super-group with the highest combined detection
value (1070), and sorts the blocks based on their
corresponding detection values (1072) to produce a ranked set of blocks
for the next stage (1074).
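
The control flow of this multi-frame stage can be sketched as a loop
that keeps pulling in frames until some super-group is strong
enough. The callables `top_groupings` and `super_groups`, and the
dictionary layout with `"value"` and `"blocks"` keys, are
hypothetical placeholders for the per-frame analysis and the
inter-frame coincidence matching; they are not taken from the text.

```python
def aggregate_frames(frames, top_groupings, super_groups, threshold):
    """Collect evidence frame by frame until a super-group exceeds threshold.

    Returns the winning super-group and its blocks ranked by detection
    value, or (None, []) if the frames run out first.
    """
    collected = []
    for frame in frames:
        collected.append(top_groupings(frame))       # top P groupings per frame
        ranked = sorted(super_groups(collected),     # inter-frame matching
                        key=lambda group: group["value"], reverse=True)
        if ranked and ranked[0]["value"] > threshold:
            best = ranked[0]
            blocks = sorted(best["blocks"],
                            key=lambda block: block["value"], reverse=True)
            return best, blocks
    return None, []
```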
4.3 Estimating Translation Parameters 

FIG. 17 is a flow diagram illustrating a method for 
estimating translation parameters of the orientation signal, 
using information gathered from the previous stages. 

In this stage, the detector estimates translation parameters. 
These parameters indicate the starting point of a water- 
marked block in the spatial domain. The translation 
parameters, along with rotation, scale and differential scale, 
form a complete 6D orientation vector. The 6D vector 
enables the reader to extract luminance sample data in 
approximately the same orientation as in the original water- 
marked image. 

One approach is to use generalized match filtering to find
the translation parameters that provide the best correlation.
Another approach is to continue to collect evidence about
the orientation vector candidates, and provide a more
comprehensive ranking of the orientation vectors based on all of
the evidence gathered thus far. The following paragraphs
describe an example of this type of approach.

To extract translation parameters, the detector proceeds as
follows. In the multi-frame case, the detector selects the
frame that produced 4D orientation vectors with the highest
detection values (1080). It then processes the blocks 1082 in
that frame in the order of their detection value. For each
block (1084), it applies the 4D vector to the luminance data
to generate rectified block data (1086). The detector then
performs dual axis filtering (1088) and the window function
(1090) on the data. Next, it performs an FFT (1092) on the
image data to generate an array of Fourier data. To make
correlation operations more efficient, the detector buffers the
Fourier values at the orientation points (1094).

The detector applies a generalized match filter 1096 to 
correlate a phase specification of the orientation signal 
(1098) with the transformed block data. The result of this 
process is a 2D array of correlation values. The peaks in this 
array represent the translation parameters with the highest 
correlation. The detector selects the top peaks and then 
applies a median filter to determine the center of each of 
these peaks. The center of the peak has a corresponding 
correlation value and sub-pixel translation value. This pro- 
cess is one example of getting translation parameters by 
correlating the Fourier phase specification of the orientation 
signal and the image data. Other methods of phase locking 
the image data with a synchronization signal like the orien- 
tation signal may also be employed. 
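
As an illustration of recovering translation by correlating phase
information, the sketch below uses plain phase-only correlation on a
tiny block. It is an assumed stand-in for the generalized match
filter named in the text, not a reproduction of it: it uses a naive
DFT instead of an FFT, assumes a circular shift, returns only an
integer offset, and omits the median filtering used to find a
sub-pixel peak center. All function names are hypothetical.

```python
import cmath

def dft2(block, inverse=False):
    """Naive 2D DFT (O(N^4)); adequate only for a tiny illustrative block."""
    n = len(block)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for y in range(n):
                for x in range(n):
                    angle = sign * 2 * cmath.pi * (u * y + v * x) / n
                    total += block[y][x] * cmath.exp(1j * angle)
            out[u][v] = total / (n * n) if inverse else total
    return out

def unit_phase(z):
    """Keep only the phase of a complex value (near-zero values become 0)."""
    return z / abs(z) if abs(z) > 1e-12 else 0j

def estimate_translation(block, reference):
    """Correlate the reference phase with the block; return the peak (dy, dx)."""
    n = len(block)
    fb, fr = dft2(block), dft2(reference)
    product = [[unit_phase(fb[u][v]) * unit_phase(fr[u][v]).conjugate()
                for v in range(n)] for u in range(n)]
    surface = dft2(product, inverse=True)   # 2D array of correlation values
    _, dy, dx = max((surface[y][x].real, y, x)
                    for y in range(n) for x in range(n))
    return dy, dx
```

The inverse transform yields the 2D array of correlation values; its
largest entry sits at the offset by which the block is shifted
relative to the reference.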

Depending on the implementation, the detector may have
to resolve additional ambiguities, such as rotation angle and
flip ambiguity. The degree of ambiguity in the rotation angle
depends on the nature of the orientation signal. If the
orientation signal is octally symmetric (symmetric about
horizontal, vertical and diagonal axes in the spatial
frequency domain), then the detector has to check each
quadrant (0-90, 90-180, 180-270, and 270-360 degrees) to find
out which one the rotation angle resides in. Similarly, if the
orientation signal is quad symmetric, then the detector has to
check two cases, 0-180 and 180-360.

The flip ambiguity may exist in some applications where
the watermarked image can be flipped. To check for rotation
and flip ambiguities, the detector loops through each
possible case, and performs the correlation operation for each
one (1100).
At the conclusion of the correlation process, the detector
has produced a set of the top translation parameters with
associated correlation values for each block. To gather
additional evidence, the detector groups similar translation
parameters from different blocks (1102), calculates a group
detection value for each set of translation parameters 1104,
and then ranks the top translation groups based on their
corresponding group detection values 1106.

4.4 Refining Translation Parameters

Having gathered translation parameter estimates, the
detector proceeds to refine these estimates. FIG. 18 is a flow
diagram illustrating a process for refining orientation
parameters. At this stage, the detector process has gathered a set of
the top translation parameter candidates 1120 for a given
frame 1122. The translation parameters provide an estimate
of a reference point that locates the watermark, including
both the orientation and message components, in the image
frame. In the implementation depicted here, the translation
parameters are represented as horizontal and vertical offsets
from a reference point in the image block from which they
were computed.

Recall that the detector has grouped translation
parameters from different blocks based on their geometric
proximity to each other. Each pair of translation parameters in a
group is associated with a block and a 4D vector (rotation,
scale, and 2 differential scale parameters). As shown in FIG.
18, the detector can now proceed to loop through each group
(1124), and through the blocks within each group (1126), to
refine the orientation parameters associated with each
member of the groups. Alternatively, a simpler version of the
detector may evaluate only the group with the highest
detection value, or only selected blocks within that group.

Regardless of the number of candidates to be evaluated,
the process of refining a given orientation vector candidate
may be implemented in a similar fashion. In the refining
process, the detector uses a candidate orientation vector to
define a mesh of sample blocks for further analysis (1128).
In one implementation, for example, the detector forms a
mesh of 32 by 32 sample blocks centered around a seed
block whose upper right corner is located at the vertical and
horizontal offset specified by the candidate translation
parameters. The detector reads samples from each block
using the orientation vector to extract luminance samples
that approximate the original orientation of the host image at
encoding time.

The detector steps through each block of samples (1130).
For each block, it sets the orientation vector (1132), and then
uses the orientation vector to check the validity of the
watermark signal in the sample block. It assesses the validity
of the watermark signal by calculating a figure of merit for
the block (1134). To further refine the orientation parameters
associated with each sample block, the detector adjusts
selected parameters (e.g., vertical and horizontal translation)
and re-calculates the figure of merit. As depicted in the inner
loop in FIG. 18 (block 1136 to 1132), the detector repeatedly
adjusts the orientation vector and calculates the figure of
merit in an attempt to find a refined orientation that yields a
higher figure of merit.

The loop (1136) may be implemented by stepping through
a predetermined sequence of adjustments to parameters of
the orientation vectors (e.g., adding or subtracting small
increments from the horizontal and vertical translation
parameters). In this approach, the detector exits the loop
after stepping through the sequence of adjustments. Upon
exiting, the detector retains the orientation vector with the
highest figure of merit.

There are a number of ways to calculate this figure of
merit. One figure of merit is the degree of correlation
between a known watermark signal attribute and a
corresponding attribute in the signal suspected of having a
watermark. Another figure of merit is the strength of the
watermark signal (or one of its components) in the suspect
signal. For example, a figure of merit may be based on a
measure of the watermark message signal strength and/or
orientation pattern signal strength in the signal, or in a part
of the signal from which the detector extracts the orientation
parameters. The detector may compute a figure of merit
based on the strength of the watermark signal in a sample block.
It may also compute a figure of merit based on the
percentage agreement between the known bits of the message and
the message bits extracted from the sample block.

When the figure of merit is computed based on a portion
of the suspect signal, the detector and reader can use the
figure of merit to assess the accuracy of the watermark signal
detected and read from that portion of the signal. This
approach enables the detector to assess the merits of
orientation parameters and to rank them based on their figure of
merit. In addition, the reader can weight estimates of
watermark message values based on the figure of merit to recover
a message more reliably.

The process of calculating a figure of merit depends on
attributes of the watermark signal and how the embedder
inserted it into the host signal. Consider an example where
the watermark signal is added to the host signal. To calculate
a figure of merit based on the strength of the orientation
signal, the detector checks the value of each sample relative
to its neighbors, and compares the result with the
corresponding sample in a spatial domain version of the
orientation signal. When a sample's value is greater than its
neighbors, then one would expect the corresponding
orientation signal sample to be positive. Conversely, when
the sample's value is less than its neighbors, then one would
expect the corresponding orientation sample to be
negative. By comparing a sample's polarity relative to its
neighbors with the corresponding orientation sample's
polarity, the detector can assess the strength of the
orientation signal in the sample block. In one implementation, the
detector makes this polarity comparison twice for each
sample in an N by N block (e.g., N=32, 64, etc.): once
comparing each sample with its horizontally adjacent
neighbors and then again comparing each sample with its
vertically adjacent neighbors. The detector performs this analysis
on samples in the mesh block after re-orienting the data to
approximate the original orientation of the host image at
encoding time. The result of this process is a number
reflecting the portion of the total polarity comparisons that
yield a match.

To calculate a figure of merit based on known signature
bits in a message, the detector invokes the reader on the
sample block, and provides the orientation vector to enable
the reader to extract coded message bits from the sample
block. The detector compares the extracted message bits
with the known bits to determine the extent to which they
match. The result of this process is a percentage agreement
number reflecting the portion of the extracted message bits
that match the known bits. Together the test for the
orientation signal and the message signal provide a figure of merit
for the block.

As depicted in the loop from blocks 1138 to 1130, the
detector may repeat the process of refining the orientation
vector for each sample block around the seed block. In this
case, the detector exits the loop (1138) after analyzing each
of the sample blocks in the mesh defined previously (1128).
In addition, the detector may repeat the analysis in the loop
through all blocks in a given group (1140), and in the loop
through each group (1142).
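
The two figures of merit described above, polarity agreement with
the orientation signal and percentage agreement with known message
bits, can be sketched as follows. Several details are assumptions
made for the example rather than statements from the text: each
sample is compared with the mean of its two neighbors, neighbors
wrap around at the block edges, and comparisons where either
polarity is zero are skipped.

```python
def polarity_figure_of_merit(block, orientation):
    """Fraction of neighbor-polarity comparisons that match the pattern."""
    n = len(block)
    matches = comparisons = 0
    for y in range(n):
        for x in range(n):
            # First the horizontal neighbors, then the vertical neighbors.
            for neighbors in (
                (block[y][(x - 1) % n], block[y][(x + 1) % n]),
                (block[(y - 1) % n][x], block[(y + 1) % n][x]),
            ):
                difference = block[y][x] - sum(neighbors) / 2
                if difference == 0 or orientation[y][x] == 0:
                    continue  # no polarity to compare
                comparisons += 1
                if (difference > 0) == (orientation[y][x] > 0):
                    matches += 1
    return matches / comparisons if comparisons else 0.0

def bit_agreement(extracted, known):
    """Fraction of extracted message bits that equal the known bits."""
    return sum(e == k for e, k in zip(extracted, known)) / len(known)
```

How the two numbers are combined into a single figure of merit for
the block is left open here, as it is in the text.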
After completing the analysis of the orientation vector
candidates, the detector proceeds to compute a combined
detection value for the various candidates by compiling the
results of the figure of merit calculations. It then proceeds to
invoke the reader on the orientation vector candidates in the
order of their detection values.

4.5 Reading the Watermark

FIG. 19 is a flow diagram illustrating a process for reading
the watermark message. Given an orientation vector and the
corresponding image data, the reader extracts the raw bits of
a message from the image. The reader may accumulate
evidence of the raw bit values from several different blocks.
For example, in the process depicted in FIG. 19, the reader
uses refined orientation vectors for each block, and
accumulates evidence of the raw bit values extracted from the
blocks associated with the refined orientation vectors.

The reading process begins with a set of promising
orientation vector candidates 1150 gathered from the
detector. In each group of orientation vector candidates, there is
a set of orientation vectors, each corresponding to a block in
a given frame. The detector invokes the reader for one or
more orientation vector groups whose detection values
exceed a predetermined threshold. For each such group, the
detector loops over the blocks in the group (1152), and
invokes the reader to extract evidence of the raw message bit
values.

Recall that previous stages in the detector have refined
orientation vectors to be used for the blocks of a group.
When it invokes the reader, the detector provides the
orientation vector as well as the image block data (1154). The
reader scans samples starting from a location in a block
specified by the translation parameters and using the other
orientation parameters to approximate the original
orientation of the image data (1156).

As described above, the embedder maps chips of the raw
message bits to each of the luminance samples in the original
host image. Each sample, therefore, may provide an estimate
of a chip's value. The reader reconstructs the value of the
chip by first predicting the watermark signal in the sample
from the value of the sample relative to its neighbors as
described above (1158). If the deduced value appears valid,
then the reader extracts the chip's value using the known
value of the pseudo-random carrier signal for that sample
and performing the inverse of the modulation function
originally used to compute the watermark information signal
(1160). In particular, the reader performs an exclusive OR
operation on the deduced value and the known carrier signal
bit to get an estimate of the raw bit value. The reader
accumulates these estimates for each raw bit value (1162).
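
The exclusive OR demodulation and the accumulation of estimates can
be sketched as below. The data layout is an assumption for the
example: each sample contributes a deduced polarity of 0 or 1 (or
`None` when the deduced value does not appear valid), `carrier`
holds the known carrier bit per sample, and `assignment` gives the
index of the raw bit each sample carries.

```python
def accumulate_raw_bits(samples, carrier, assignment, num_bits):
    """Tally XOR-demodulated estimates as [zeros, ones] for each raw bit."""
    votes = [[0, 0] for _ in range(num_bits)]
    for deduced, carrier_bit, bit_index in zip(samples, carrier, assignment):
        if deduced is None:
            continue  # deduced value did not appear valid; skip this sample
        estimate = deduced ^ carrier_bit   # inverse of the XOR modulation
        votes[bit_index][estimate] += 1
    return votes
```

Because XOR is its own inverse, applying the carrier bit a second
time recovers the raw bit that was modulated at embedding time.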

As noted above, the reader computes an estimate of the
watermark signal by predicting the original,
un-watermarked signal and deriving an estimate of the
watermark signal based on the predicted signal and the
watermarked signal. It then computes an estimate of a raw
bit value based on the value of the carrier signal, the
assignment map that maps a raw bit to the host image, and
the relationship among the carrier signal value, the raw bit
value, and the watermark signal value. In short, the reader
reverses the embedding functions that modulate the message
with the carrier and apply the modulated carrier to the host
signal. Using the predicted value of the original signal and
an estimate of the watermark signal, the reader reverses the
embedding functions to estimate a value of the raw bit.

The reader loops over the candidate orientation vectors 
and associated blocks, accumulating estimates for each raw 
bit value (1164). When the loop is complete, the reader 
calculates a final estimate value for each raw bit from the
estimates compiled for it. It then performs the inverse of the
error correction coding operation on the final raw bit values
(1166). Next, it performs a CRC check to determine whether the
read is valid. If no errors are detected, the read operation is
complete and the reader returns the message (1168).
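
The final decision and validity check can be sketched as follows.
The text does not specify the error correction code or the CRC
layout, so both are assumptions here: `decode` is a hypothetical
placeholder for the inverse error correction step (identity by
default), and the message is assumed to be payload bytes followed by
a 4-byte big-endian CRC-32 computed with `zlib.crc32`.

```python
import zlib

def final_bits(votes):
    """Majority decision per raw bit from [zeros, ones] vote counts."""
    return [1 if ones > zeros else 0 for zeros, ones in votes]

def bits_to_bytes(bits):
    """Pack a list of 0/1 values, most significant bit first, into bytes."""
    return bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )

def read_message(votes, decode=lambda bits: bits):
    """Return the payload if the trailing CRC matches, otherwise None."""
    data = bits_to_bytes(decode(final_bits(votes)))
    payload, received = data[:-4], int.from_bytes(data[-4:], "big")
    return payload if zlib.crc32(payload) == received else None
```

A `None` result corresponds to an invalid read.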

However, if the read is invalid, then the detector may 
either attempt to refine the orientation vector data further, or 
start the detection process with a new frame. Preferably, the 
detector should proceed to refine the orientation vector data 
when the combined detection value of the top candidates 
indicates that the current data is likely to contain a strong 
watermark signal. In the process depicted in FIG. 19, for 
example, the detector selects a processing path based on the 
combined detection value (1170). The combined detection 
value may be calculated in a variety of ways. One approach 
is to compute a combined detection value based on the 
geometric coincidence of the top orientation vector candi- 
dates and a compilation of their figures of merit. The figure 
of merit may be computed as detailed earlier. 

For cases where the read is invalid, the processing paths
for the process depicted in FIG. 19 include: 1) refine the top
orientation vectors in the spatial domain (1172); 2) invoke
the translation estimator on the frame with the next best
orientation vector candidates (1174); and 3) re-start the
detection process on a new frame (assuming an
implementation where more than one frame is available) (1176). These
paths are ranked in order from the highest detection value to
the lowest. In the first case, the orientation vectors are the
most promising. Thus, the detector re-invokes the reader on
the same candidates after refining them in the spatial domain
(1178). In the second case, the orientation vectors are less
promising, yet the detection value indicates that it is still
worthwhile to return to the translation estimation stage and
continue from that point. Finally, in the third case, the
detection value indicates that the watermark signal is not
strong enough to warrant further refinement. In this case, the
detector starts over with the next new frame of image data.

In each of the above cases, the detector continues to
process the image data until it either makes a valid read, or
has failed to make a valid read after repeated passes through
the available image data.
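
The choice among the three processing paths after an invalid read
can be sketched as a pair of threshold tests on the combined
detection value. The two threshold parameters and the returned
labels are hypothetical; the text states only that the paths are
ranked from the highest detection value to the lowest.

```python
def choose_path(combined_value, refine_threshold, translate_threshold):
    """Pick a recovery path; assumes refine_threshold > translate_threshold."""
    if combined_value >= refine_threshold:
        return "refine_in_spatial_domain"   # path 1 (1172)
    if combined_value >= translate_threshold:
        return "re_estimate_translation"    # path 2 (1174)
    return "restart_on_new_frame"           # path 3 (1176)
```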

5.0 Operating Environment for Computer 
Implementations 

FIG. 20 illustrates an example of a computer system that
serves as an operating environment for software
implementations of the watermarking systems described above. The
embedder and detector implementations are written in
C/C++ and are portable to many different computer systems.
FIG. 20 generally depicts one such system.

The computer system shown in FIG. 20 includes a
computer 1220, including a processing unit 1221, a system
memory 1222, and a system bus 1223 that interconnects
various system components including the system memory to
the processing unit 1221.

The system bus may comprise any of several types of bus
structures including a memory bus or memory controller, a
peripheral bus, and a local bus using a bus architecture such
as PCI, VESA, Micro Channel (MCA), ISA and EISA, to
name a few.

The system memory includes read only memory (ROM)
1224 and random access memory (RAM) 1225. A basic
input/output system 1226 (BIOS), containing the basic
routines that help to transfer information between elements
within the computer 1220, such as during start-up, is stored
in ROM 1224.
The computer 1220 further includes a hard disk drive
1227, a magnetic disk drive 1228, e.g., to read from or write
to a removable disk 1229, and an optical disk drive 1230,
e.g., for reading a CD-ROM or DVD disk 1231 or to read
from or write to other optical media. The hard disk drive
1227, magnetic disk drive 1228, and optical disk drive 1230
are connected to the system bus 1223 by a hard disk drive
interface 1232, a magnetic disk drive interface 1233, and an
optical drive interface 1234, respectively. The drives and
their associated computer-readable media provide
nonvolatile storage of data, data structures, computer-executable
instructions (program code such as dynamic link libraries,
and executable files), etc. for the computer 1220.

Although the description of computer-readable media
above refers to a hard disk, a removable magnetic disk and
an optical disk, it can also include other types of media that
are readable by a computer, such as magnetic cassettes, flash
memory cards, digital video disks, and the like.

A number of program modules may be stored in the drives
and RAM 1225, including an operating system 1235, one or
more application programs 1236, other program modules
1237, and program data 1238.

A user may enter commands and information into the
computer 1220 through a keyboard 1240 and pointing
device, such as a mouse 1242. Other input devices may
include a microphone, joystick, game pad, satellite dish,
digital camera, scanner, or the like. A digital camera or
scanner 43 may be used to capture the target image for the
detection process described above. The camera and scanner
are each connected to the computer via a standard interface
44. Currently, there are digital cameras designed to interface
with a Universal Serial Bus (USB), Peripheral Component
Interconnect (PCI), and parallel port interface. Two
emerging standard peripheral interfaces for cameras include USB2
and 1394 (also known as FireWire and iLink).

Other input devices may be connected to the processing
unit 1221 through a serial port interface 1246 or other port
interfaces (e.g., a parallel port, game port or a universal
serial bus (USB)) that are coupled to the system bus.

A monitor 1247 or other type of display device is also
connected to the system bus 1223 via an interface, such as
a video adapter 1248. In addition to the monitor, computers
typically include other peripheral output devices (not
shown), such as speakers and printers.

The computer 1220 operates in a networked environment
using logical connections to one or more remote computers,
such as a remote computer 1249. The remote computer 1249
may be a server, a router, a peer device or other common
network node, and typically includes many or all of the
elements described relative to the computer 1220, although
only a memory storage device 1250 has been illustrated in
FIG. 20. The logical connections depicted in FIG. 20 include
a local area network (LAN) 1251 and a wide area network
(WAN) 1252. Such networking environments are
commonplace in offices, enterprise-wide computer networks,
intranets and the Internet.

When used in a LAN networking environment, the
computer 1220 is connected to the local network 1251 through
a network interface or adapter 1253. When used in a WAN
networking environment, the computer 1220 typically
includes a modem 1254 or other means for establishing
communications over the wide area network 1252, such as
the Internet. The modem 1254, which may be internal or
external, is connected to the system bus 1223 via the serial
port interface 1246.

In a networked environment, program modules depicted
relative to the computer 1220, or portions of them, may be
stored in the remote memory storage device. The processes
detailed above can be implemented in a distributed fashion,
and as parallel processes. It will be appreciated that the
network connections shown are exemplary and that other
means of establishing a communications link between the
computers may be used.

While the computer architecture depicted in FIG. 20 is
similar to typical personal computer architectures, aspects of
the invention may be implemented in other computer
architectures, such as hand-held computing devices like
Personal Digital Assistants, audio and video players,
network appliances, telephones, etc.

"Smart Images" Using Digimarc's Watermarking Technology

This section introduces the concept of Smart Images and
explains the use of watermarking technology in their
implementation. A Smart Image is a digital or physical image that
contains a digital watermark, which leads to further
information about the image content via the Internet,
communicates ownership rights and the procedure for obtaining usage
rights, facilitates commerce, or instructs and controls other
computer software or hardware. Thus, Smart Images,
empowered by digital watermarking technology, act as
active agents or catalysts which gracefully bridge both
traditional and modern electronic commerce. This section
presents the use of Digimarc Corporation's watermarking
technology to implement Smart Images. The section
presents an application that demonstrates how Smart Images
facilitate both traditional and electronic commerce. This
section also analyzes the technological challenges to be
faced for ubiquitous use of Smart Images.

1. INTRODUCTION

Since the dawn of history, images have been used to
communicate information in many applications and for
many different purposes. In recent times, capturing,
storing, editing, retouching, printing, copying, and
transmitting high quality colored images have become a
multi-billion dollar industry, as well as a primary focus of national
and international research institutions and organizations.
This tremendous growth has resulted in many advances
benefiting the imaging field and its applications. For
example, affordable high-resolution scanners and digital
CMOS cameras (cameras on chips) are widely used. Color
printers and color laser copiers have become very affordable.
Professional image editing and manipulation software
packages have been developed for the PC and Mac platforms,
and are available at very affordable prices. The speed and the
storage capacity of hard disks, CD-ROM, DVD, and optical
storage devices have increased tremendously to allow for the
display and storage of a very large number of
high-resolution images and video sequences. Affordable,
ultra-fast computing platforms have become available for office
and home use. The high-speed Internet backbone has
become ubiquitous, and high-speed modems have become
the standard entry-level Internet connection. Powerful image
compression algorithms such as JPEG, and Internet
browsers that are able to upload, download, and view
high-resolution images are currently in general use on the
Internet. So enabled, more and more images appear in the
physical and digital world around us.

Media producers have become justifiably concerned
about copyright protection of digital images, since
unauthorized copies of digital images are very easy to make. Hence,
early research efforts have focused on digital watermarking
technology as a technique to communicate and enforce
copyrights, detect counterfeit copies, and deter improper use
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of digital media in general, and digital images in particular. specialized software or hardware to induce the best utiliza- 
Digital watermarking technology allows the user to embed tion of the image. This data may be self-contained or it may 

digital messages within media content. These digital mes- include pointers to a complete knowledge structure on a 

sages are imperceptible to humans but can be read by local database or on the Internet (FIG. (21)). This knowledge 

computers and q)ecialized devices. In an early watermark- 5 structure may include information about ownership rights, 

ing technique, ones and zeros in a watermark payload are image creation, image content, and instructions for the 

encoded by increasing or decreasing the pixel vdues around software and hardware that may process the image. The 

selected "signature" points. This technique is detailed in a dormant data is interwoven with the media content and 

patent filed by Corbis and now owned by Digimarc Corpo- cannot be easily removed from the image without degrading 

ration. In another technique, the ones and zeros are encoded lO the image quality. This data travels with the image and 

by stunming or subtracting an ensemble of imcorrelated survives image processing and manipulation operations, 

noise frames from an image. Again, this technique is such as scaling, rotation, cropping, filtering, compression, 

detailed in a patent owned by D^imarc. Both techniques are and digital-to-analog (e.g. printing) and analog-to-digital 

sensitive to visibility (and audibility) concerns and tailor the (e.g. scanning) conversion. Parts (3) and (4) explain how this 

encoding to exploit data hiding features of the underlying is data can be added to the image. 

content. Hence, with Digimarc's PictureMarc, a visually Different regions of a Smart Image may carry independent 

imperceptible signal can be embedded in a digital still dormant information. Hence, different parts of the image 

image. This signal can be detected and read with Digimarc's lead to different knowledge structures or instruct the soft- 

Plug-in detector, which is integrated into leading image ware and hardware differenfly. This is useful in applications 

editing software. Whenever the image editing software 20 where images contain more than one object. For example, 

opens an image file, the detector automatically detects such FIG. (22) shows a Smart Image that represents a typical 

watermarks. The user of the image editing software can then promotion ad in a newspaper. The image contains two 

read the watermark and determine the owner of the image. regions, each with different dormant information. The first 

Similarly, Digimarc's MarcSpider scans the Internet looking region surrounds the JVC video camera and contains dor- 

for images with a watermark and reports the locations of 25 mant information related to that camera. The second region 

watermarked images to the registered owner for fiirther surrounds the Sony video camera and contains dormant 

actions. information related to that camera. 

In this section, the use of digital watermarking technology 2.2. How Smart Images are Different 

is expanded beyond copyright protection. Digital water- Adding imperceptible dormant information to the image 

marking technology is used to facilitate both traditional and 30 facilitates image interpretation. In general, image interpre- 

electronic commerce. In both types of commerce, still tation requires the use of intelligent pattern-recognition 

images are extensively used, but their full potential is not algorithms that are extremely hard to design. These algo- 

currently exploited. Images processed by these applications rithms exploit the image data itself and do not require 

are used for advertising and promoting products in additional information. Although this field is very attractive, 

magazines, newspapers, and in the greatest show on earth: ^5 it has had very limited success in some industrial applica- 

±e Internet. The adage "a picture is worth a thousand tions and its general use is still a challenging research area, 

words" is the basic driving force behind this use. Simply put. However, by adding dormant information to the image, the 

a picture inherently conveys much more information to the image becomes smarter, and the image interpretation prob- 

consumer than text or audio alone. With the advent of digital lem is reduced to detecting and reading the embedded 

watermarking technology, the image can now be embedded information using sophisticated signal processing algo- 

with a digital watermark that is imperceptible to the user. rithms. 

This watermark can be embedded in digital images as well dormant data in a Smart Image is different from the 

as in printed pictures, and it contains additional information header, encapsulated information, or metadata (additional 

that remains dormant until the proper software or hardware information about the data) often added to a digital image 

detects it. When this additional information is retrieved, it •'^ file to facilitate file manipulation and display. Metadata 

can be displayed to the user, used to obtain more information structures are used to provide unique identifying information 

from the Internet, or used to control the software or hard- atx)ut digital images. These data structures contain text data 

ware that is processing the image. This dormant information ^ appended to the image files rather than embedded 

gives the image some intelligence, hence we have coined the within the image itseff. Therefore, once the digital image is 

term Smart Image. Since a Smart Image contains a water- printed on paper, all the metadata structure is left behind, 

mark that leads to more information about the image, it Moreover, metadata has the disadvantage of increasing the 

could be said that "a Smart Image is worth more than a sizeof the image file and generally may not survive a change 

thousand words." in the image format (e.g., from TIFF to JPEG or vice versa). 

Part (2) of this section further explains the concept of '^^ contrary, the data in a Smart Image is interwoven 

Smart Images. Part (3) presents a brief overview of Digi- ^he image and survives printing and image refonnat- 

marc Corporation's digital watermarking technology. Part ^J^- ™* ^ selected to provide unique identifica- 

(4) demonstrates how a Smart Image system creates a bridge information about the image, and used instead of 

between traditional and electronic commerce. Part (5) ana- metadaU to facilitate the archiving, indexing, cataloging, 

lyzes the technological challenges to be faced for ubiquitous previewing, and retrieving of digital images, 

utihzation of Smart Images. The last part of this section *° Smart Images are also different from "DataGlyphs," 

presents some conclusions. which was recently introduced by Xerox Corporation. 

"DataGlyphs" encodes machine-readable data onto paper 

2. SMART IMAGES documents to facilitate document processing. The idea is 

2.1. Definition similar to the ubiquitous bar codes on consumer products. 

In this section, a Smart Image is defined as a digital or 65 Insteadof vertical line segments of differing widths, the data 

physical still image that conUins visually imperceptible data is encoded as small 45-degree diagonal lines called glyphs, 

that remains dormant until it is detected and retrieved by Each of these lines represents a single binary 0 or 1, 
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depending on whether it slopes to the left or right. Sequences 
of these glyphs can be used to encode numeric, textual, or 
other information. These glyphs are then printed on the 
document as visible gray patterns, which can appear as 
backgrounds, shading patterns or conventional graphic
design elements. Although the presence of these patterns 
may go unnoticed in text documents, it introduces a major 
degradation in quality when added to a natural picture. 

Smart Images are different from images with hot spots,
usually encountered in interactive multimedia applications
or Internet browsing. Images with hot spots are usually
dummy bitmaps that are used as a graphical interface to
guide the user to select the proper choice during an
interactive session. They contain no additional information
beyond the face value of the image; hence they contain no
intelligence. All apparent intelligence is due to the
associated multimedia program. Replacing an image of this kind
with another image of the same size would have no impact
on the program as long as the user remembers where on the
image to click in order to activate a desired choice.
Similarly, copying the image to another application and
clicking on any of its hot spots would not cause anything to
happen. On the other hand, Smart Images are independent of
the software or hardware that may process them. The
information they contain is what gives the software or
hardware the desired intelligence. Replacing a Smart Image
with an ordinary image will deprive the software or hardware
of its apparent intelligence. Moreover, using a Smart
Image with any software or hardware that is enabled to
exploit the dormant information will produce the same
desired effect.



Digital watermarking technology can be used for embedding
dormant information into Smart Images. For this
purpose, a useful and effective watermarking technology
must provide a method to embed data invisibly, promote a
high information rate or capacity, allow the embedded data
to be readily extracted by hardware or software, require
minimum processing time, and incorporate a fair amount of
robustness against standard image manipulation operations
and basic attacks. Although Smart Images are expected to be
used for facilitating commerce, immunity to basic attacks is
still required for some applications such as those needed to
communicate ownership rights. Digimarc Corporation has
developed a commercially available technology that meets
all these requirements. Digimarc's digital watermarking
technology can be classified as a mixed domain technique,
since it embeds signals in the frequency as well as in the
spatial domain representation of the image. The frequency
domain signal is used for synchronization purposes, while
the spatial domain signal contains the payload.
3.1. The Embedder

The process of embedding a digital watermark into an
image using Digimarc's watermarking technology can be
summarized as follows. First, the image is divided into
blocks of NxM pixels. Then the watermark is independently
embedded in each of these blocks. This allows the watermark
to be detected from an image region as small as NxM
pixels. Spread spectrum techniques are used to make the
signal imperceptible and to combat the effect of image
manipulation and filtering. Let
$W_0(n) = \{w_{0_1}, w_{0_2}, \ldots, w_{0_{I-1}}, w_{0_I}\}$
be the watermark signal to be embedded in the image,
where $w_{0_i} \in \{-1, 1\}$. The amount of information to be
embedded determines the length of the vector $W_0(n)$. This
amount of information should not exceed the channel capacity
represented by the original image. Error correction
techniques such as Bose-Chaudhuri-Hocquenghem (BCH) or
Convolutional Codes are first applied to $W_0(n)$ in order to
produce a robust signal,
$W_{ep}(n) = \{w_{ep_1}, w_{ep_2}, \ldots, w_{ep_{L-1}}, w_{ep_L}\}$,
where $L > I$. Also, let
$K_i(n) = \{k_{i_1}, k_{i_2}, \ldots, k_{i_{J-1}}, k_{i_J}\}$
be a set of L pseudo-random binary keys, where
$k_{i_j} \in \{-1, 1\}$ and $J \times L = N \times M$.
Each of these keys is associated with one of the bits in
the error-protected watermark, $W_{ep}(n)$. These random keys
are first used to spread each of the bits of the watermark
signal, $W_{ep}(n)$, to produce $C_i(n)$, which is a spread
watermark of length J:

$C_i(n) = w_{ep_i} \times K_i(n)$    (1)

Also, let $I_i(m,n)$ be an NxM matrix that maps each of the
bits of $C_i(n)$ to a particular location in the NxM space. The
locations of all the bits that belong to $C_i(n)$ are marked as
1's in the NxM binary mask $M_i(m,n)$ and everything else is
marked as 0. Also, each mask is orthogonal to all the masks
associated with the other bits. The scatter of each bit is

$S_i(m,n) = M_i(m,n) \, C_i(I_i(m,n))$    (2)

The above process is similar to data interleaving in spread
spectrum communications, which is used to combat burst
error. Finally, the sum of the scattered bits is added to the
image, $P(m,n)$, to produce the watermarked image, $P_w(m,n)$:

$P_w(m,n) = P(m,n) + a_{m,n} \sum_{i=1}^{L} S_i(m,n)$    (3)

where $a_{m,n}$ is a gain coefficient that is calculated based
on the image properties around location (m,n) in the block. A
synchronization signal is also added in the process to aid
detection.
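
As an illustration only, the spreading, scattering, and addition steps of equations (1) to (3) might be sketched in Python as follows. The function names, the random chip-to-pixel assignment, and the constant gain are assumptions made for this sketch. In particular, the constant gain stands in for the image-dependent gain coefficient, and neither error correction nor the synchronization signal is modeled.

```python
import random

def spread(bits, keys):
    """Equation (1): multiply each +/-1 bit by its own +/-1 key of length J."""
    return [[b * k for k in key] for b, key in zip(bits, keys)]

def make_keys_and_locations(n_rows, n_cols, n_bits, seed=0):
    """Hypothetical helper: draw L random keys and a random chip-to-pixel map."""
    rng = random.Random(seed)
    chips_per_bit = (n_rows * n_cols) // n_bits  # J, chosen so that J x L = N x M
    keys = [[rng.choice((-1, 1)) for _ in range(chips_per_bit)]
            for _ in range(n_bits)]
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    rng.shuffle(cells)
    # Each bit gets its own disjoint set of pixels, so the per-bit masks
    # never overlap (the orthogonality condition in the text).
    locations = [cells[i * chips_per_bit:(i + 1) * chips_per_bit]
                 for i in range(n_bits)]
    return keys, locations

def embed_block(block, bits, keys, locations, gain=2):
    """Equations (2)-(3): scatter every chip to its pixel and add it to the block.

    `gain` is a constant stand-in for the image-dependent gain coefficient.
    """
    marked = [row[:] for row in block]  # leave the original block untouched
    for chips, locs in zip(spread(bits, keys), locations):
        for chip, (r, c) in zip(chips, locs):
            marked[r][c] += gain * chip
    return marked

block = [[128] * 8 for _ in range(8)]        # flat 8x8 test block
bits = [1, -1, -1, 1]                        # error-protected bits, L = 4
keys, locations = make_keys_and_locations(8, 8, len(bits))
marked = embed_block(block, bits, keys, locations)
```

With an 8x8 block and four bits, each bit is spread over J = 16 pixels, and every pixel of the marked block differs from the original by exactly the gain times one chip.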

3.2. The Detector 

The detector reverses the operation of the embedder. It
starts by extracting the synchronization signal from the
frequency domain of the image. It then uses this signal to
resolve the scale, orientation, and origin of the watermark
signal. Finally, it reads and decodes the watermark signal.
Since the detector does not use the original image, $P(m,n)$,
the read process starts by estimating the watermark signal
from $P_w(m,n)$. In this case, the original image $P(m,n)$ is
considered to be noise, or a noisy two-dimensional channel.
Since the pixels of the original image are assumed to be
highly correlated locally, the digital value of the spread
watermark signal can be estimated by first predicting the
original pixel value, $P(m,n)$, using the local properties of
the image, then subtracting it from $P_w(m,n)$. This produces
an image representing the scattered watermark, $\hat{S}(m,n)$.

The normalized scatter of each bit, $\hat{S}_i(m,n)$, can be
extracted from $\hat{S}(m,n)$ using $M_i(m,n)$. An inverse
mapping procedure is used to reconstruct an estimate of
$C_i(n)$ according to the following equations:

$\hat{C}_i(I_i(m,n)) = \hat{S}_i(m,n)$    (5)

$\hat{C}_i(n) = w_{ep_i} \times K_i(n) + \eta_i(n)$    (6)

where $\eta_i(n)$ is additive interference. Now, an estimate of
the error-protected watermark can be obtained by correlating
the received signal for each bit with its associated key. Hence,
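
Written out from equation (6), and assuming the sum is not normalized by J, the correlation for bit i is $\sum_{n=1}^{J} \hat{C}_i(n) K_i(n) = J \, w_{ep_i} + \phi$, where $\phi = \sum_{n=1}^{J} \eta_i(n) K_i(n)$ and the first term follows because every key element squared equals 1. This expression is a reconstruction from the surrounding definitions. A minimal Python sketch of the correlate-and-threshold step is given below; the function names and the bounded uniform interference model are illustrative choices, not part of the described detector.

```python
import random

def correlate(received, key):
    """Sum received[n] * key[n] over the J chips belonging to one bit."""
    return sum(c * k for c, k in zip(received, key))

def detect_bit(received, key):
    """Threshold the correlation at zero: positive peak -> +1, otherwise -1."""
    return 1 if correlate(received, key) > 0 else -1

def simulate_received(bit, key, rng, noise=0.9):
    """Hypothetical channel model: bit * key plus zero-mean interference.

    Each interference sample is bounded by `noise` < 1, so its total
    contribution to the correlation can never exceed J in magnitude.
    """
    return [bit * k + rng.uniform(-noise, noise) for k in key]

rng = random.Random(1)
key = [rng.choice((-1, 1)) for _ in range(64)]   # J = 64 chips
received = simulate_received(-1, key, rng)
print(detect_bit(received, key))                 # prints -1
```

Because the interference in this sketch is bounded below the chip amplitude, the sign of the correlation always matches the embedded bit. With a real host image the interference is not bounded in this way, which is why error correction is applied afterwards.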



In the above equation, multiplying $K_i(n)$ by the
interference $\eta_i(n)$ spreads the power of $\eta_i(n)$ over a
much wider frequency band. This is similar to spreading the
power of the original watermark signal as in equation (1)
above. Moreover, summing $\eta_i(n) \times K_i(n)$ from n=1 to J
is in essence a low pass filtering of the resulting wide band
interference. The result of this filtering is $\phi$, which is
a zero mean random variable with a small variance. This
filtering has only an amplification effect on $w_{ep_i}$, since
it is assumed to be a narrow band signal. Hence, if $w_{ep_i}$
is 1, the above operation produces a positive peak; otherwise,
it produces a negative peak. Thresholding the resulting value
at zero produces an estimate of the binary error protected
watermark signal. Finally, the estimated watermark vector
$\hat{W}_{ep}(n) = \{\hat{w}_{ep_1}, \hat{w}_{ep_2}, \ldots, \hat{w}_{ep_L}\}$
is error corrected to produce the embedded watermark signal
$W_0(n) = \{w_{0_1}, w_{0_2}, \ldots, w_{0_{I-1}}, w_{0_I}\}$.

Though detection as a concept is best illustrated using
classic linear correlation, it is well known in the field of
digital communication that a wide variety of non-linear
techniques tend to optimize the detection performance itself.

4. SAMPLE APPLICATION

In this section, we describe Digimarc's MediaBridge,
which is a Smart Image system that creates a bridge between
traditional commerce and electronic commerce (FIG. (23)).
It presents a fundamentally new way to access and use the
Internet. In this application, Digimarc's watermarking
technology is used to embed digital watermarks in printed
images such as magazine advertisements, event tickets, CD
covers, book covers, direct mailers, debit and credit cards,
greeting cards, coupons, catalogs, business cards, and goods
packaging. As shown in FIG. (24A), creating a Smart Image
is very simple. The process starts with a digital image, on
which the watermark is embedded as described in part (3)
above. This produces a Smart Image in digital form. Finally,
the digital Smart Image is printed and published using a
normal screen printing process.

When the user produces a digital image of one of these
printed Smart Images via a flatbed scanner or a digital
camera, the Smart Image application or the input device (or
its software driver) detects and reads the embedded
watermark (FIG. (24B)). The embedded watermark represents an
n-bit index to a database of URLs stored on a known
location on the Internet, e.g., the Digimarc server. This index
is used to fetch a corresponding URL from the database.
Then the URL is used by the Internet browser to display the
related Web page or start a Web-based application specified
by the creator of the image. Hence, MediaBridge creates a
bridge between the printed material and the Internet,
permitting users to link directly to relevant Web destinations
without any typing, mouse clicks, or time consuming
searching. This provides physical media with digital
capabilities, allowing new forms of interaction with the
digital world, thereby enhancing publishing, advertising, and
electronic commerce.
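
The lookup described above can be pictured as a table keyed by the watermark payload. The following Python sketch is illustrative only: the bit ordering, the in-memory dictionary standing in for the server-side database, and the example URLs are all assumptions.

```python
def payload_to_index(bits):
    """Pack n payload bits (0/1, most significant first) into an integer index."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index

def resolve(index, url_table):
    """Return the URL registered for an index, or None if none is registered."""
    return url_table.get(index)

# Hypothetical registry standing in for the server-side database of URLs.
url_table = {5: "https://example.com/products/camera"}

index = payload_to_index([0, 1, 0, 1])
print(resolve(index, url_table))  # https://example.com/products/camera

# Re-pointing the entry changes the destination without touching the printed image.
url_table[index] = "https://example.com/products/camera-v2"
print(resolve(index, url_table))  # https://example.com/products/camera-v2
```

Keeping only a short index in the image and the URL in a table means the destination can be changed after printing, since the embedded payload itself never has to change.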



4.2. Advantages of the MediaBridge System 

Embedding imperceptible digital watermarks offers several
advantages over printing the URL on an advertisement.
First, using digital watermarks does not require any real
estate of the image and thus preserves the image quality.
Presenting the URL on the image consumes some of the
image's valuable real estate and degrades image quality.
Second, MediaBridge does not require the user to type the
URL in order to access the Internet. Typing URLs, especially
long ones, can be confusing and error prone, and may hinder
some users from accessing the Internet. Third, the
imperceptible watermarks can be language-dependent and allow
better tracking of advertisements. Depending on the
language of the advertisement, a corresponding code can be
embedded to allow the user to go directly to a Web page with
the same language as that of the advertisement. Similarly,
different watermarks can be used for different publications
to allow advertisers to track their advertisements and
optimize their advertising campaign. With printed URLs, this
can be achieved only by using very long URLs, which is
clearly undesirable.

MediaBridge offers great flexibility to advertisers. Once
an image is embedded with the desired digital watermark,
the knowledge structure at the advertiser's server can be
relocated or updated as desired without re-embedding,
re-printing and re-publishing the advertisement. If the
knowledge structure has been relocated, the advertisers need
only update the related URL at Digimarc's server, so that the
new Web page will be displayed to the user once the input
device detects one of their advertisements.

MediaBridge also has several advantages over traditional
Internet browsers. When using an Internet browser to
retrieve desired information from the Internet, the user is
confronted with multiple Web sites and information
overload. Most of these Web sites are very confusing, deep, and
often loaded with graphics, images, or animation. Searching
these Web sites to retrieve the desired information over a
slow Internet link can be time consuming and frustrating,
especially to Internet novices. MediaBridge, on the other
hand, retrieves the desired information directly and quickly
by showing a Smart Image to the PC camera or scanning it
with a scanner. No browsing of several Web pages is 



4.3. System Requirements

A typical PC configuration for use with MediaBridge is a
233 MHz Pentium CPU, 32 Mbytes of DRAM, a 1 Gbyte
hard disk, and an attached PC video camera or scanner.
However, a better PC configuration would enhance the
performance of the application. The PC must also run the
MediaBridge software. The PC may be connected to the
Internet through a dial up modem or a direct LAN connection.
However, for fast retrieval of the information a direct
LAN connection or very high-speed modem is highly
recommended. The digital camera may be either a still or a
video camera. A good quality CCD (Charge Coupled
Device) digital camera provides the best MediaBridge
performance. Also, an analog camera connected to a classic
video capture board could be used instead of the digital
camera.

A digital camera or scanner is needed only when dealing
with Smart Images in printed form. They are not needed if
the image is already in digital form, as is the case when the
image is posted on the Internet. In this case, Internet
browsers such as Netscape Navigator and Microsoft Internet
Explorer could be enhanced to include the watermark reader.
Also, Internet browser capabilities can be enhanced further
to display an icon on a corner of the image to indicate a
hot spot when a Smart Image is encountered. When the user
clicks on this hot spot, the browser displays a special menu
that is unique to that image, guiding the user and suggesting
further action. The user may then select any of the displayed
options to obtain more information from the Internet
and maximize his information gain. The content of the menu
and its associated pointers is retrieved from a central server
such as Digimarc's server.

4.4. Usage Examples

Smart Images can be used in a variety of ways to facilitate
commerce. For example, if a reader wants
information about an advertised product in a magazine, he
simply shows the ad to the camera and goes directly to a
precise location on the advertiser's Web site. He can get all
the product details and specifications, locate a local dealer,
or order online. If the reader wants to get more information
on the subject of one of the articles in a magazine, he can
hold the article up to the camera, and start reading. This
allows him to go directly to other Web sites, where he can
find and even order online related books, articles, etc. If the
reader wants to subscribe to a magazine, he simply shows
the front cover or the subscription card of the magazine to
the camera. Subscription information appears and he
subscribes online. Similarly, if the reader wants to take
advantage of an appealing offer found in a magazine, instead of
calling an 800 number, he can go an easier route and apply
online, immediately receiving all the associated promotions.

Smart Images can also be used to promote the sale of
audio CDs, DVD movies, and books. For example, assume
a consumer has just bought a CD of his favorite artist and is
interested in other music by the same artist. Simply showing
the back cover of the CD to the camera takes him to a Web
site to purchase other CDs from the artist's collection. Or, if
he is interested in the artist's latest song, he just holds the
front of the CD in front of the camera and listens. In this
case, the Internet browser first launches an MP3 audio
player. Then it starts playing from the appropriate Web site
a WAV file representing the latest song. The same idea can
be used with DVD movies. If the consumer holds the DVD
movie cover in front of the camera, the Internet browser first
launches MediaPlayer. Then it starts playing a trailer of the
main star's latest movie from the appropriate Web site.
Similarly, showing the cover of a book to the camera takes
the consumer to a site where he can order the book or see a
list of books about the same subject or a list of books by the
same author. Moreover, showing the book cover to the
camera may play a trailer of a movie about the book, if there
is one. It also allows the consumer to buy the movie online.

Smart Images have interesting uses with tickets for
sporting events and concerts. For example, before a game a sports
fan holds the front of an admittance ticket up to a PC camera.
A Web page is displayed that shows the location of his
stadium seat, a map of how to find the seat, and a view of
the field from that seat. By showing the back of the ticket,
the sports fan might see promotional material and
merchandise for the event. After the event has taken place, showing
the same ticket to the PC camera might take the sports fan to
a Web page with detailed scoring information, game
highlights, related links and merchandise specially
discounted for ticket holders. Other types of events could have
their own special information. For example, after a concert,
a special offer on a music CD might be available only to
ticket holders. For an airline ticket, the current state of the
traveler's frequent flyer account could be displayed. In
addition, watermarking technology could be used to detect
counterfeit tickets, which are becoming a large problem.

Smart Images can also be used in Edutainment. They can
whisk a child into the exciting world of children's books.
Simply insert the CD, which comes packaged with a "Smart
Book," let the child show any page of the book to the digital
camera, and page by page, the story unfolds. A pre-reader
can hear a story read out loud. An older reader can follow
along at his own level, or listen to a story that is too
advanced to read alone. As the story unfolds, animation,
songs, and exciting graphics carry the child along on a
reading adventure. For activity books, the computer can also
give verbal directions when the child shows a page to the
camera.

The list of possible applications of Smart Images is
growing every day and is limited only by the imagination.

5. TECHNOLOGICAL CHALLENGES AND 
REQUIRED INFRASTRUCTURE 
Full utilization and deployment of Smart Images involves
facing several challenges, which include the following:
1. Smart Images either include pointers to knowledge
structures on a local database or on the Internet, or they are
self-contained. When a Smart Image contains pointers,
the embedded information is a few bytes representing the
pointers. When a Smart Image is self-contained, the
image itself contains all the desired information. In most
applications it is necessary to embed as much information
as possible into the Smart Image without degrading the
image quality. Although the amount of information that
can be embedded highly depends on the nature of the host
image, it is also limited by information theory. The
amount of information is further reduced by the need to
improve detectability of the watermark signal by repeating
the watermark over several regions of the image.
Hence, there is a three-way trade-off between image
quality, information rate, and detectability of the
watermark. Increasing the information rate might be at the
expense of watermark detectability. In most applications
of Smart Images, the visual quality of the image is
extremely important. While decreasing the visibility of
the watermark preserves image quality, it automatically
decreases watermark detectability. Automatic optimization
of these three conflicting requirements is a challenge
that warrants further research and development.
2. Although embedding speed is not critical, detection speed
is crucial to using Smart Images. Watermark embedding
can be achieved off-line, but in order to avoid user
frustration, watermark detection must be accomplished as
quickly as possible. Most printed advertisements and
pictures are large in size and produce huge digital files
when exposed to the PC camera. Processing this amount
of data in real time is a challenging task. The frame rate
of most video cameras is at least 10 frames/second. If
detection is not accomplished as soon as the image is
captured, the user may think that the ad is not placed
properly in front of the camera. So, the user may move the
picture to change the distance or the orientation angle in
an attempt to improve detection. This would, in fact,
cause further delay and may even make detection
impossible. Buffering one frame and ignoring subsequent
frames until watermark detection is complete may help
speed up the detection process, as long as the detector
succeeds in reading the watermark. Another way is to
quickly examine the entire frame to locate the region with
the strongest watermark signal and then process only that
region. If the frame does not contain a strong watermark
signal, the detector would quickly discard the entire frame
and start searching a new frame. The fundamental
solution, however, is to face the basic challenge of
speeding up the watermark detector, which is heavily loaded
with many sophisticated signal processing techniques.

3. The size of an image captured by a digital camera highly
depends on the distance of the object from that camera.
Also, the size of an image captured by a scanner depends
on the scanning resolution used. To correctly read the
watermark the reader must precisely know the scale of the
image. Although a watermark detector such as Digimarc's
detector is capable of determining this scale from the
captured image, more robustness to variations in scale,
especially robustness to a wider range, is still necessary.
Moreover, holding a Smart Image in front of the camera
at an arbitrary distance risks that the camera will not be
focused. Although expensive cameras may have an
auto-focusing capability, the lenses on most economical
cameras must be focused manually. Hence, these cameras may
capture out-of-focus (blurred) images. This is similar to
convolving the image and the embedded watermark
signal with a blur function. Detecting blurred images and
estimating the parameters of the blur function help to
de-blur these images to recover the watermark signal.

4. To correctly read the watermark in a Smart Image the
reader must know the precise orientation angle of the
image. With scanners, this rotation angle is simple since
it is limited to rotation in the scanner's plane. However,
with digital cameras, this rotation angle can be arbitrary
with three degrees of freedom. Although a watermark
detector, such as Digimarc's detector, is capable of
determining orientation in the scanner plane or on a plane
perpendicular to the focal axis of the camera, arbitrary
orientation is still a major challenge. This arbitrary
orientation may cause the embedded signal to suffer from
geometrical distortion. Geometrical distortion also occurs
from bending, crumpling, or folding the picture. This
distortion is similar to the jitter in spread spectrum
communication. In this case, the distance between the
chips of the spread signal becomes irregular, and
de-spreading would not produce the correct signal.
Estimating this geometrical distortion and correcting it during
the process of reading the watermark remains a challenging
problem to watermark detectors.

5. Some video cameras produce an interlaced output. When
one of these cameras is used with Smart Images, the
detector must operate on fields rather than frames. By
definition, a field contains either the odd or even lines of
a frame, and two consecutive fields originate from two
consecutive frames. Hence, the detector must combine
two fields to compose a frame and avoid a major
degradation of the watermark signal. The process of combining
the fields also must compensate for the motion between
the fields. This process is practical if the frame rate of the
camera is high enough and if the user does not frequently
move the image in front of the camera.

6. The printing process may degrade the embedded watermark
signal. Digital watermarking is normally performed using
digital images represented in the RGB or CMYK color space at
300 DPI (dots per inch). The watermarked images are then
printed on paper with a screen-printing process that uses
the CMYK subtractive color space at a lines per inch (LPI)
value ranging from 65 to 200. 133 lines/in is typical for
quality magazines and 73 lines/in is typical for newspapers.
In order to produce a good image quality and avoid
pixelization, the rule of thumb is to use digital images
with a resolution (DPI) that is at least twice the press
resolution (LPI). This is due to the use of halftone
printing for color production. Also, different presses use
screens with different patterns and line orientations and
have different precision for color registration. Hence, one
challenge is to perform in-depth characterization of the
printing process and optimize the watermark embedding and
reading processes based on this characterization.

7. A related challenge addresses the variety of papers.
Papers of various qualities, thickness, and stiffness absorb
ink in various ways. Some papers absorb ink evenly, while
others absorb ink at rates that vary with the changes in the
paper's texture and finish. This may degrade the embedded
watermark signal when a digitally watermarked image is
printed. A suitable classification and characterization of
paper will lead to ways of embedding digital watermarks that
compensate for this printing-related degradation.

8. Most CCD and CMOS cameras use an array of sensors to
produce colored images. This requires dividing the sensors
in the array among the three primary colors red (R), green
(G), and blue (B) according to a specific pattern. All the
sensors that are designated for a particular color are dyed
with that color to increase their sensitivity to the
designated color, hence producing the desired color. Most
camera manufacturers use the Bayer color pattern GR/BG.
Although this pattern has proved to produce good image
quality, it causes color misregistration that degrades the
watermark signal. Moreover, the color space converter,
which maps the signal from the sensors to YUV or RGB color
space, may vary from one manufacturer to another.
Accounting for the Bayer color pattern during the color
mapping process would improve the detection of the
watermark signal.

9. Different input devices introduce different types of dis- 
tortion. For example, cameras made by different manu- 
facturers may have different sensitivities to light. Their 
lenses may cause different spherical distortions and their 
sensors may have different noise characteristics. 
Moreover, due to the underlying technology, CCD cam- 
eras typically produce better image quality than CMOS 
cameras. Similarly, flatbed scanners are of various quali- 
ties. Some of them have poor color reproduction or 
introduce a slight distortion in image aspect ratio. Also, 
some scanners introduce aliasing and employ interpola- 
tion to increase the scanning resolution. Accounting for 
these differences and addressing these problems in the 
design of the watermark embedder and detectors remain 
a challenge awaiting a solution. 

10. Unlike digital images, printed images do not maintain
their qualities. They are subject to aging, soiling,
crumpling, tearing, and deterioration. Moreover, they may
be used in varied lighting conditions. Hence, designing a
watermark detector that is immune to these unintentional
attacks and works for any lighting condition is another
challenge to be addressed.
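
The rule of thumb stated in item 6 above (image resolution in
DPI at least twice the press resolution in LPI) can be checked
with a short calculation. The following Python sketch is
illustrative only; the function names and the default factor
of two are assumptions taken from the stated rule, not part of
any printing standard.

```python
def min_image_dpi(press_lpi, factor=2.0):
    """Smallest image resolution (DPI) suggested by the rule of thumb.

    `factor` defaults to 2.0, i.e. "at least twice the press resolution".
    """
    return factor * press_lpi


def meets_rule_of_thumb(image_dpi, press_lpi, factor=2.0):
    """True if an image at `image_dpi` satisfies the rule for `press_lpi`."""
    return image_dpi >= min_image_dpi(press_lpi, factor)


# Screen rulings mentioned in the text: newspapers, magazines,
# and the top of the 65 to 200 LPI range.
IMAGE_DPI = 300
for label, lpi in [("newspaper", 73), ("magazine", 133), ("finest screen", 200)]:
    needed = min_image_dpi(lpi)
    ok = meets_rule_of_thumb(IMAGE_DPI, lpi)
    print(f"{label}: {lpi} LPI needs >= {needed:.0f} DPI; 300 DPI sufficient: {ok}")
```

Under this rule, a 300 DPI image satisfies the 73 and 133 LPI
cases, which call for at least 146 and 266 DPI, but not a
200 LPI press, which would call for 400 DPI.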

6. CONCLUSIONS 
In this section, we introduced the concept of Smart 
Images and explained the use of Digimarc Corporation's 
digital watermarking technology in their implementation. A 
Smart Image is a digital or a physical image that is embed- 
ded with a specialized digital watermark. The digital water- 
mark acts as an active agent or catalyst that empowers the 
Smart Image with efficient access to further, specific
information about the image content. This may be "direct"
information such as ownership and usage rights, or more
importantly, it may be information located on local
databases or on specific Web pages on the Internet,
information that facilitates e-commerce, or information that
instructs and controls further computer software or hardware
actions. Thus, the systems that implement Smart Images
create a graceful bridge between physical space and the
virtual space of the Internet. Full utilization of Smart
Images requires improving the watermark embedding and
detection processes to operate very efficiently in a variety
of environments and conditions. Smart Images is the first
step in seamlessly linking content to people, places and
things and can also be extended to other multimedia elements
such as audio and video.

6.0 Concluding Remarks 

Having described and illustrated the principles of the
technology with reference to specific implementations, it
will be recognized that the technology can be implemented
in many other, different forms. The techniques for embedding
and detecting a watermark may be applied to various types of
watermarks, including those encoded using linear or
non-linear functions to apply a watermark message to a host
signal. As one example, embedding methods, such as methods
for error correction coding, methods for mapping watermark
messages to the host signal, and methods for redundantly
encoding watermark messages, apply whether the watermarking
functions are linear or non-linear. In addition, the
techniques for determining and refining a watermark's
orientation apply to linear and non-linear watermark
methods. For example, the methods described above for
detecting orientation of a watermark signal in a potentially
transformed version of the watermarked signal apply to
watermark systems that use different methods for embedding
and reading messages, including, but not limited to,
techniques that modulate spatial or temporal domain
intensity values, that modulate transform coefficients, or
that employ dither modulation or quantization index
modulation.

Some of the detector methods described above invoke a 
watermark message reader to assess the merits of a given 
orientation of a watermark signal in a potentially trans- 
formed version of the watermarked signal. In particular, 
some of these techniques assess the merits of an orientation 
by invoking a reader to determine the extent to which known
message bits agree with those read from the watermarked 
signal using the orientation. These techniques are not spe- 
cific to the type of message encoding or reading as noted in 
the previous paragraph. The merits of a given estimate of a 
watermark signal's orientation may be assessed by selecting 
an orientation parameter that increases correlation between 
the watermark signal (or known watermark signal attributes) 
and the watermarked signal, or that improves recovery of
known watermark message bits from the watermark signal. 
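
As a rough illustration of assessing an orientation estimate
by the agreement of known message bits, the following Python
sketch reduces "orientation" to a single translation offset in
a one-dimensional signal. The spread-spectrum embedder and
reader, and all names used (make_carrier, embed, read_bits,
bit_agreement, best_offset), are hypothetical stand-ins chosen
for the example, not the detector described above.

```python
import random


def make_carrier(length, seed):
    """Pseudorandom +/-1 carrier shared by the embedder and the reader."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(host, bits, carrier, offset, gain=2.0):
    """Add each bit, spread over its block of carrier chips, starting at offset."""
    chips = len(carrier) // len(bits)
    out = list(host)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for k in range(chips):
            j = i * chips + k
            out[offset + j] += gain * sign * carrier[j]
    return out


def read_bits(signal, n_bits, carrier, offset):
    """De-spread the signal at a candidate offset and threshold each bit."""
    chips = len(carrier) // n_bits
    bits = []
    for i in range(n_bits):
        acc = sum(signal[offset + i * chips + k] * carrier[i * chips + k]
                  for k in range(chips))
        bits.append(1 if acc > 0 else 0)
    return bits


def bit_agreement(signal, known_bits, carrier, offset):
    """Fraction of known message bits recovered correctly at this offset."""
    read = read_bits(signal, len(known_bits), carrier, offset)
    return sum(r == k for r, k in zip(read, known_bits)) / len(known_bits)


def best_offset(signal, known_bits, carrier, candidates):
    """Pick the candidate offset whose read-out best matches the known bits."""
    return max(candidates,
               key=lambda off: bit_agreement(signal, known_bits, carrier, off))


# Demo with assumed parameters: 32 known bits, 16 chips per bit.
rng = random.Random(0)
N_BITS, CHIPS, TRUE_OFFSET = 32, 16, 7
carrier = make_carrier(N_BITS * CHIPS, seed=1)
known_bits = [rng.randint(0, 1) for _ in range(N_BITS)]
host = [rng.gauss(0.0, 1.0) for _ in range(N_BITS * CHIPS + 16)]
signal = embed(host, known_bits, carrier, TRUE_OFFSET)

estimate = best_offset(signal, known_bits, carrier, range(16))
print("estimated offset:", estimate)
print("agreement at estimate:", bit_agreement(signal, known_bits, carrier, estimate))
```

The same scoring idea extends to other orientation parameters
(for example rotation or scale) by resampling the signal with
each candidate before reading the known bits; that resampling
step is omitted here for brevity.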

Some watermark readers extract a message from a water- 
marked signal by correlating known attributes of a message 
symbol with the watermarked signal. For example, one 
symbol might be associated with a first pseudorandom noise 
pattern, while another symbol is associated with another 
pseudorandom noise pattern. If the reader determines that a 
strong correlation between the known attribute and the 
watermark signal exists, then it is likely that the
watermarked signal contains the message symbol.
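
A correlation-based reader of this kind might be sketched in
Python as follows. The pattern length, gain, noise level, and
detection threshold are assumed values chosen for the example,
and the names pn_pattern, correlate, and read_symbol are
hypothetical.

```python
import random


def pn_pattern(length, seed):
    """Pseudorandom +/-1 noise pattern associated with one message symbol."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def correlate(signal, pattern):
    """Mean of the sample-by-sample product of signal and pattern."""
    return sum(s * p for s, p in zip(signal, pattern)) / len(pattern)


def read_symbol(signal, patterns, threshold):
    """Return the symbol whose pattern correlates most strongly, or None
    if no correlation reaches the threshold, together with all scores."""
    scores = {sym: correlate(signal, pat) for sym, pat in patterns.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores


# Demo: two symbols, each tied to its own pseudorandom pattern.
LENGTH, GAIN, THRESHOLD = 1024, 1.0, 0.5
patterns = {"A": pn_pattern(LENGTH, seed=1), "B": pn_pattern(LENGTH, seed=2)}

rng = random.Random(0)
host = [rng.gauss(0.0, 2.0) for _ in range(LENGTH)]
marked = [h + GAIN * p for h, p in zip(host, patterns["B"])]  # embed symbol "B"

symbol, scores = read_symbol(marked, patterns, THRESHOLD)
print("symbol read from marked signal:", symbol)
print("symbol read from unmarked host:", read_symbol(host, patterns, THRESHOLD)[0])
```

The threshold separates a strong correlation, which suggests
the symbol is present, from the small chance correlation that
an unmarked signal has with any pseudorandom pattern.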

Other watermark readers analyze the watermarked signal 
to identify attributes that are associated with a message 
symbol. Generally speaking, these watermark readers are
using a form of correlation, but in a different form. If the 
reader identifies evidence of watermark signal attributes 
associated with a message symbol, it notes that the associ- 
ated message symbol has likely been encoded. For example, 
readers that employ quantization index modulation analyze 
the watermarked signal by applying quantizers to the signal 
to determine which quantizer was most likely used in the 
embedder to encode a message. Since message symbols are 
associated with quantizers, the reader extracts a message by 
estimating the quantizer used to encode the message. In 
these schemes, the signal attribute associated with a message 
symbol is the type of quantization applied to the signal. 
Regardless of the signal attributes used to encode and read 
a watermark message, the methods described above for 
determining watermark orientation and refining orientation 
parameters still apply. 
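
The quantizer-estimation step of a quantization index
modulation reader can be sketched as below. This is a minimal
scalar example under assumed conditions: one message bit, two
uniform quantizers offset from each other by half a step, and
a step size of 8. The function names are hypothetical.

```python
def quantize(x, step, offset):
    """Nearest reconstruction point of the uniform quantizer with this offset."""
    return round((x - offset) / step) * step + offset


def qim_embed(samples, bit, step=8.0):
    """Embed one bit by quantizing every sample with the quantizer for that bit."""
    offset = 0.0 if bit == 0 else step / 2
    return [quantize(x, step, offset) for x in samples]


def qim_read(samples, step=8.0):
    """Estimate which quantizer was used: the one with the smaller total
    squared distance between the samples and its reconstruction points."""
    errors = []
    for offset in (0.0, step / 2):
        errors.append(sum((x - quantize(x, step, offset)) ** 2 for x in samples))
    return 0 if errors[0] <= errors[1] else 1


# Demo: embed each bit value, add a small distortion, and read it back.
host = [3.2, 17.9, -5.4, 42.0, 100.3]
noise = [0.9, -1.1, 0.4, -0.7, 1.3]  # smaller than a quarter of the step
for bit in (0, 1):
    received = [m + n for m, n in zip(qim_embed(host, bit), noise)]
    print("embedded", bit, "read", qim_read(received))
```

Reading succeeds here because the distortion stays below a
quarter of the step size, so each received sample remains
closer to the quantizer that was used than to the other one.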

To provide a comprehensive disclosure without unduly
lengthening the specification, applicants incorporate by
reference the patents and patent applications referenced
above. Additional information is attached in the section
entitled "Smart Images" Using Digimarc's Watermarking
Technology. This section describes additional embodiments
and applications of watermark embedding and detecting
technology. For additional information about a detector
optimization that looks for a watermark in portions of a
signal that are more likely to contain a recoverable
watermark signal, see U.S. patent application Ser. No.
09/302,663, filed Apr. 30, 1999, entitled Watermark
Detection Utilizing Regions with Higher Probability of
Success, by Ammon Gustafson, Geoffrey Rhoads, Adnan Alattar,
Ravi Sharma and Clay Davidson (Now U.S. Pat. No. 6,442,284).

The particular combinations of elements and features in 
the above-detailed embodiments are exemplary only; the 
interchanging and substitution of these teachings with other 
teachings in this and the incorporated-by-reference patents/ 
applications are also contemplated. 

What is claimed is: 

1. A method of detecting a watermark in a multidimen- 
sional signal, the method comprising: 

estimating an initial orientation of a watermark signal in 
the multidimensional signal without reference to an 
original, un-watermarked version of the multidimen- 
sional signal; and 

refining the initial orientation to compute a refined 
orientation, including computing at least one orienta- 
tion parameter that: 

increases correlation between the watermark signal or
watermark signal attribute and the multidimensional
signal; or

improves recovery of known watermark message bits
from the watermark signal, when the watermark or
multidimensional signal are adjusted with the refined
orientation.


2. The method of claim 1 including: 

estimating initial watermark candidates of the watermark 
signal in the multidimensional signal; 

refining the initial orientation candidates to compute 
refined orientation candidates, including for each can- 
didate: 

computing at least one orientation parameter that 
increases correlation between the watermark signal or 
watermark signal attributes and the multidimensional 
signal when the watermark or multidimensional signal 
are adjusted with the refined orientation candidate. 
3. The method of claim 2 wherein:

the initial and refined candidates are computed for por- 
tions of the multidimensional signal; 

and the refined candidates are further refined by comparing
similarity of orientation candidates from different
portions, and evaluating merits of the candidates based 
on similarity. 

4. The method of claim 3 wherein the portions are from
a single image frame. 

5. The method of claim 3 wherein the portions are from 
different image frames.

6. The method of claim 1 including: 

evaluating a candidate by extracting watermark values 
from the multidimensional signal and determining the
extent to which the watermark values match expected 

7. A method of detecting a watermark in a target signal, 
the method comprising: 

computing orientation parameter candidates of a watermark
signal in different portions of the target signal;

comparing similarity of orientation parameter candidates 
from the different portions of the target signal;

based at least in part on comparing the similarity of the 
orientation parameter candidates, determining an
orientation of the watermark in the target signal.

8. The method of claim 7 wherein the portions of the 
target signal are from a single image frame.

9. The method of claim 7 wherein portions of the target 
signal are from different frames.

10. A method of detecting a watermark in a target signal:

estimating orientation of the watermark in the target
signal;

using the orientation to extract a measure of the
watermark in the target; and

using the measure to assess merits of the estimated
orientation.


11. The method of claim 10 wherein the measure includes 
an extent to which values of image samples in the target 
signal are consistent with the watermark. 

12. The method of claim 10 wherein the measure includes 
an extent to which values of watermark message bits read 
from the target signal are consistent with expected bits. 

13. A watermark detector comprising: 

means for computing orientation of a watermark signal in 
a target signal without reference to an original, 
un-watermarked version of the target signal; 

means for adjusting at least portions of the target signal 
based on the orientation; and 

means for reading a message encoded in the watermark 
signal from the adjusted target signal portions. 

14. The detector of claim 13 wherein means for comput- 
ing the orientation includes: 

means for estimating orientation parameters; and 
means for refining the orientation parameters. 

15. The detector of claim 14 including: 

means for estimating orientation parameters from differ- 
ent portions of the target signal; 

means for grouping orientation parameters from different
portions of the target signal. 

16. The detector of claim 15 wherein the different portions 
are different spatial portions of the signal. 

17. The detector of claim 15 wherein the different portions 
are different temporal portions of the signal. 

18. The detector of claim 14 including: 

means for estimating rotation and scale parameters; 
means for refining the rotation and scale estimates; 
means for estimating translation parameters; and 
means for refining the translation parameters. 